{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "\"\"\" Created on November 13, 2023 // Updated on March 20, 2026 // @author: Sarah Shi \"\"\"\n",
    "\n",
    "import os\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "import mineralML as mm\n",
    "\n",
    "import matplotlib.pyplot as plt\n",
    "%matplotlib inline\n",
    "%config InlineBackend.figure_format = 'png'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# mineralML Quickstart for Mapped EDS Data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This notebook shows **how to load and run your quantitative EDS data through mineralML** with an example gabbroic nodule from the Galapagos: `09g3`. These data are from Gleeson et al., 2024 (find the paper here: https://doi.org/10.1093/petrology/egaf031) in this GitHub repository (https://github.com/gleesonm1/GleesonEtAl_JPet_2024_supplement/tree/main/Code_Figures/Data/LargeScaleMaps/Gabbro%20Samples). Please refer to the paper for more information about this sample. \n",
    "\n",
    "This is a five step process: \n",
    "1. Load a directory containing all your CSVs of mapped chemical data with `mm.load_df` (or `pd.read_csv` directly). [Optional] Convert the input from element to oxide wt%.\n",
    "2. Predict the mineral class with mineralML and automatically plot the mineral phase map, mineral phase counts, and prediction score histograms\n",
    "3. Plot prediction score map and individual oxide maps.\n",
    "4. Plot compositions of mapped minerals in various classification diagrams (ternary, quadrilateral).\n",
    "5. Plot chemical variation maps.\n",
    "\n",
    "I have conveniently (I hope!) wrapped all of these bits into one function, called `mm.run_map`.\n",
    "\n",
    "We loaded in the **mineralML** Python package as `mm`. **mineralML** has trained machine learning models for classifying minerals. This implementation aims to get your electron microprobe or quantitative EDS compositions classified and processed. We remove some degrees of freedom to simplify the process as much as possible. The minerals considered for this study include: Amphibole, Apatite, Biotite, Calcite, Chlorite, Epidote, Feldspar (Alkali Feldspar and Plagioclase), Garnet, Glass, Kalsilite, Leucite, Melilite, Muscovite, Nepheline, Olivine, Oxide (Rhombohedral_Oxides including Hematite-Ilmenite, Spinel_Group including Magnetite-Spinel), Pyroxene (Clinopyroxene, Orthopyroxene, Na-Pyroxene), Quartz, Rutile, Serpentine, Titanite, Tourmaline, and Zircon. \n",
    "\n",
    "CSV files containing your mapped data in oxide weight percentages (or converted to) is necessary. Find an example [here](https://github.com/sarahshi/mineralML/tree/main/docs/examples/Maps/09g3), reproduced from the Gleeson et al., 2024 GitHub linked above. The necessary oxides are SiO$_2$, TiO$_2$, Al$_2$O$_3$, FeO$_t$, MnO, MgO, CaO, Na$_2$O, K$_2$O, Cr$_2$O$_3$, P$_2$O$_5$, and ZrO$_2$ (if you are aiming to classify zircon). For the oxides not analyzed for specific minerals, the preprocessing will fill in the nan values as 0. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1. Load and prepare data for analysis\n",
    "\n",
    "Here, we will work with data that are in elemental weight percent. This means that we will have to do a conversion to oxide weight percent. The data directory is `Maps/09g3`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Find your directory of mapped mineral data, stored in Maps/09g3. \n",
    "# This code identifies any file with CSV and appends it to the map. \n",
    "\n",
    "base = \"Maps\"\n",
    "map_dirs = []\n",
    "for root, subdirs, files in os.walk(base):\n",
    "    # Skip any path that includes 'Ignore' in its folder names\n",
    "    if \"Ignore\" in root.split(os.sep):\n",
    "        continue\n",
    "    \n",
    "    if any(f.lower().endswith(\".csv\") for f in files):\n",
    "        map_dirs.append(root)\n",
    "\n",
    "print(map_dirs)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. Apply the trained neural network with mm.run_map\n",
    "\n",
    "We will use `mm.run_map` which will return all you need!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Inspect the inputs and outputs of mm.run_map\n",
    "help(mm.run_map)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Here is our all in one function! Read the inputs and outputs provided above. \n",
    " \n",
    "output = mm.run_map(next((s for s in map_dirs if '09g3' in s), None), # provide the directory of interest. alternatively, you can provide the preloaded dictionary of oxides. \n",
    "                    renormalize=False, # optionally renormalize totals to 100 wt%. \n",
    "                    total_threshold=None, # optionally filter out SiO2 values below a given value, for when EDS picks up epoxy pixels\n",
    "                    pred_score_threshold=0.6, # provide a prediction score threshold. here, i only want values with >= 0.6 prediction score.\n",
    "                    min_frac=0.01, # provide a minimum pixel fraction for the phase to be displayed\n",
    "                    units='element_wt%', # provide the unit. can choose 'element_wt%' or 'oxide_wt%'\n",
    "                    phases=['Plagioclase', 'Clinopyroxene', 'Orthopyroxene', 'Oxide', 'Olivine', 'Glass'], # phases of interest\n",
    "                    scalebar_um=50, # define size of scalebar desired, in microns\n",
    "                    pixel_size_um=2, # define size of each pixel of scalebar, in microns \n",
    "                    scalebar_loc='upper right', # specify location for scalebar\n",
    "                    scalebar_col='white', # specify color for scalebar\n",
    "                    )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Inspect what is in the outputs\n",
    "output.keys()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's say you now want to work with these data in dataframe form rather than dictionary form. How would you do this? Access `output['df_pred']` to retrieve the predictions dataframe."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Pull the dataframe of predictions\n",
    "df_pred = output['df_pred']\n",
    "display(df_pred)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. Plot oxide concentration maps and prediction score maps\n",
    "\n",
    "Let's plot the original oxide maps loaded from the directory. We can examine how well the predicted phase map matches some of the observations made in oxide space. We have a handy function for doing so, with `mm.plot_oxide_map`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fig, ax = mm.plot_oxide_map(\n",
    "    output, # take the output from run_map\n",
    "    oxide_name='SiO2', # specify the oxide of interest\n",
    "    scalebar_um=50, # define size of scalebar desired, in microns\n",
    "    pixel_size_um=2, # define size of each pixel of scalebar, in microns \n",
    "    scalebar_loc='upper right', # specify location for scalebar\n",
    "    scalebar_col='black', # specify color for scalebar\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's plot the prediction scores from the output, in mapped form. This allows for further investigation to determine where predictions are more and less certain. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fig, ax = mm.plot_score_map(\n",
    "    output, # take the output from run_map\n",
    "    scalebar_um=50, # define size of scalebar desired, in microns\n",
    "    pixel_size_um=2, # define size of each pixel of scalebar, in microns \n",
    "    scalebar_loc='upper right', # specify location for scalebar\n",
    "    scalebar_col='black', # specify color for scalebar\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4. Plot compositions of mapped minerals in various classification diagrams (ternary, quadrilateral).\n",
    "\n",
    "We can do some more with mineralML now. Let's plot all the feldspars, pyroxenes, and spinels in ternary space. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Identify the phases present."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Here are all our feldspars \n",
    "fspars = df_pred[df_pred.Predict_Mineral == 'Plagioclase']\n",
    "display('Feldspars:', fspars)\n",
    "\n",
    "# Here are all our pyroxenes \n",
    "pxs_names = ['Clinopyroxene', 'Orthopyroxene']\n",
    "pxs = df_pred[df_pred.Predict_Mineral.isin(pxs_names)]\n",
    "display('Pyroxenes:', pxs)\n",
    "\n",
    "# Here are all our oxides \n",
    "ox_names = ['Oxide']\n",
    "oxs = df_pred[df_pred.Predict_Mineral.isin(ox_names)]\n",
    "display('Oxides:', oxs)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Plot these feldspars, pyroxenes, and spinels! "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Use FeldsparClassifier to examine at the component space (XAn, XAb, XOr)\n",
    "fspar_comp = mm.FeldsparClassifier(fspars).calculate_components()\n",
    "display(fspar_comp)\n",
    "\n",
    "# Use FeldsparClassifier to plot up these data. \n",
    "fig = mm.FeldsparClassifier(fspars).plot()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Use PyroxeneClassifier to examine at the component space (En, Wo, Fs). If sodic pyroxenes are also within this input, this will plot them up in the sodic pyroxene ternary\n",
    "pxs_comp = mm.PyroxeneClassifier(pxs).calculate_components()\n",
    "display(pxs_comp)\n",
    "\n",
    "# Use PyroxeneClassifier to plot up these data. \n",
    "fig = mm.PyroxeneClassifier(pxs).plot()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Use OxideClassifier to examine at the component space.\n",
    "oxs_comp = mm.OxideClassifier(oxs).calculate_components()\n",
    "display(oxs_comp)\n",
    "\n",
    "# Use OxideClassifier to plot up these data. \n",
    "fig = mm.OxideClassifier(oxs).plot()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You might note that the structure of these three `...Classifier` classes is identical. That is intentional! `mm.FeldsparClassifier`, `mm.PyroxeneClassifier`, and `mm.OxideClassifier` all have `calculate_components` and `plot` methods embedded."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 5. Plot chemical variation maps\n",
    "\n",
    "We know the mineralogy now. What if you now want to inspect the chemical variation within the individual crystals? Pull the component maps created for each sample and plot this up with `mm.plot_component_composite`.\n",
    "\n",
    "This function currently does this calculation for feldspars, pyroxenes, olivines, and amphibole. This can easily be expanded with all the stoichiometric mineral functions. Here, I will just show this for these common igneous phases."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Inspect what’s available:\n",
    "print(sorted(output[\"component_maps\"].keys()))\n",
    "\n",
    "#  Plot map highlighting internal compositional variation\n",
    "fig = mm.plot_component_composite(output, # specify output from above\n",
    "                                  title=\"09g3\", # optionally add a title to this plot\n",
    "                                  phases=['Plagioclase', 'Clinopyroxene', 'Orthopyroxene', 'Oxide', 'Olivine', 'Glass'], # phases of interest\n",
    "                                  smooth_sigma=0.25, # add a Gaussian blur to smooth compositional data, usually turned off. \n",
    "                                  scalebar_um=50, # define size of scalebar desired, in microns\n",
    "                                  pixel_size_um=2, # define size of each pixel of scalebar, in microns \n",
    "                                  scalebar_loc='upper right', # specify location for scalebar\n",
    "                                  scalebar_col='black', # specify color for scalebar\n",
    "                                  )\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "One could alternatively use all the functions within `mineralML.mapping` to do these same things, in a more stepwise manner. Look through the documentation if you would like to use individual bits of this code."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "science",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.14"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}