Explore our Science


Natural Kinds Clustering | Predicting formational environments of mineral samples

Mineral modes of formation provide insights into Earth’s co-evolving geosphere and biosphere, and also have the potential to illustrate otherwise obscure aspects of planetary evolution. Because its classification criteria are largely based on unique combinations of idealized major element composition and crystal structure, the modern mineral classification system does not offer insights into the diverse modes of origin of each mineral. As a case study, here we use natural clustering (unsupervised machine learning) to divide pyrite into different clusters and link the clusters with pyrite formation environments. Pyrite from a variety of deposit types (e.g., iron oxide copper-gold, orogenic Au, porphyry Cu, sedimentary exhalative, and volcanic-hosted massive sulfide deposits), together with barren sedimentary pyrite, is used to evaluate the clustering process. Concentrations of an array of trace elements determined by LA-ICP-MS are used as predictor variables. Different clustering algorithms are employed in this study and the differences between their model outputs are highlighted. Using the natural clustering technique, we are able to divide the formation environment of pyrite into three categories: low temperature, medium temperature, and high temperature. This study has implications for the elucidation of mineral-forming environments and could be applied to clustering of a wide range of condensed planetary materials with different paragenetic origins.
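As an illustration of the clustering step, the sketch below groups pyrite samples by their log-transformed trace element concentrations. It is a minimal example under stated assumptions: k-means is used only as a stand-in (the abstract does not name the algorithms employed), and the element choice (Co, Ni, As), the concentrations, and all function names are hypothetical. Linking the resulting clusters to temperature regimes would still require geological interpretation.

```python
import math

def log_transform(sample):
    """Trace elements span orders of magnitude, so cluster on log10(ppm)."""
    return [math.log10(v) for v in sample]

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(points):
    """Component-wise mean of a non-empty list of points."""
    return [sum(col) / len(points) for col in zip(*points)]

def kmeans(points, k, max_iter=50):
    """Plain k-means with deterministic farthest-point initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = centroid(members)
    return labels, centers

# Hypothetical Co, Ni, As concentrations in ppm (illustrative values, not measured data).
samples_ppm = [
    [10, 20, 100], [12, 25, 120],
    [1000, 500, 10], [900, 600, 12],
    [100, 5000, 5000], [120, 4000, 6000],
]
points = [log_transform(s) for s in samples_ppm]
labels, centers = kmeans(points, k=3)
print(labels)
```

Swapping in a different algorithm (hierarchical or density-based clustering, for example) changes only the `kmeans` call, which makes it straightforward to compare how the cluster assignments differ between methods.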


Association Analysis | Predicting the location of previously unknown mineral deposits

The oldest minerals are surviving materials from the formation of our solar system and they provide information about the evolution of Earth and other planets. Mindat (mindat.org), the Mineral Evolution Database (RRUFF.info/Evolution), and the Global Earth Mineral Inventory are some of the well-known datasets in the field of mineralogy, which contain data about almost all known localities on Earth where minerals have been found. The increase in the amount and accuracy of mineral data and the improvements in technological resources make it possible to explore and answer large, outstanding scientific questions, such as understanding the mineral assemblages on Earth and how they compare to assemblages and localities on other planets. In this contribution, we present an affinity analysis method to (1) predict unreported minerals at an existing locality and (2) predict localities for a set of known minerals.

Association Analysis, or Market Basket Analysis, is a machine learning method that uses mined association rules to find interesting patterns in the data. The strength of the rules is identified using some measures of interestingness, such as ‘lift’. For example, when the occurrence of a mineral predicted with high confidence at a given locality is unexpected (low support), the rule used for such a prediction is considered ‘very interesting’. Successful implementation of this methodology will greatly aid the mineral discovery process.
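The standard rule measures can be sketched as follows, treating each locality as a "basket" of minerals. This is a minimal example under stated assumptions: the localities and mineral lists are hypothetical toy data, and the function names are illustrative rather than taken from the project's code.

```python
# Hypothetical toy data: each locality maps to the set of minerals reported there.
localities = {
    "loc_A": {"pyrite", "quartz", "galena"},
    "loc_B": {"pyrite", "quartz"},
    "loc_C": {"quartz", "calcite"},
    "loc_D": {"pyrite", "galena", "sphalerite"},
    "loc_E": {"galena", "sphalerite"},
}

def support(itemset, baskets):
    """Fraction of baskets that contain every mineral in itemset."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)

def rule_metrics(antecedent, consequent, baskets):
    """Support, confidence, and lift for the rule antecedent -> consequent."""
    s_a = support(antecedent, baskets)
    s_c = support(consequent, baskets)
    s_both = support(set(antecedent) | set(consequent), baskets)
    confidence = s_both / s_a if s_a else 0.0
    # Lift > 1 means the consequent co-occurs with the antecedent more often
    # than expected if the two were independent.
    lift = confidence / s_c if s_c else 0.0
    return {"support": s_both, "confidence": confidence, "lift": lift}

baskets = list(localities.values())
metrics = rule_metrics({"galena"}, {"sphalerite"}, baskets)
print(metrics)
```

In this toy data, loc_A reports galena but not sphalerite, so a rule galena -> sphalerite with lift above 1 would flag sphalerite as a candidate unreported mineral at that locality.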


Label Distribution Learning | Estimating the chemical composition of minerals on Mars

To better understand the formational conditions and geologic history of the minerals found by NASA's MSL rover Curiosity in Gale crater, Mars, the CheMin X-ray diffractometer team developed a crystal-chemical method to predict limited chemical compositions of the minerals observed in the CheMin samples [1,2]. In this study, we adapt a machine learning technique, Label Distribution Learning (LDL) [3], to predict multicomponent chemical compositions of Gale crater mineral phases, thereby allowing for more detailed petrologic interpretation of the geologic history of the martian surface.

LDL is a novel framework for classification problems with small datasets and has been widely applied to facial recognition problems such as age estimation. In this study, we adapt the LDL algorithm such that it can predict chemical elements (labels) and their abundances (degrees) for each martian mineral sample, based on crystallographic parameters. We evaluate performance using distance and similarity between label distributions as well as mean square error and also compare the results to traditional machine learning methods.
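The evaluation step can be illustrated with a few measures commonly used to compare label distributions. This is a minimal sketch under stated assumptions: the abstract does not specify which distance and similarity measures were used, so Chebyshev distance, Kullback-Leibler divergence, cosine similarity, and intersection similarity are shown only as typical choices, and the element abundances are hypothetical.

```python
import math

def to_distribution(abundances):
    """Normalise element abundances so the description degrees sum to 1."""
    total = sum(abundances.values())
    if total <= 0:
        raise ValueError("abundances must have a positive sum")
    return {el: v / total for el, v in abundances.items()}

def chebyshev(p, q):
    """Distance: largest absolute difference over all labels."""
    return max(abs(p[el] - q[el]) for el in p)

def kl_divergence(p, q, eps=1e-12):
    """Distance: KL divergence of q from p (eps guards against division by zero)."""
    return sum(p[el] * math.log(p[el] / max(q[el], eps)) for el in p if p[el] > 0)

def cosine(p, q):
    """Similarity: cosine of the angle between the two distributions."""
    dot = sum(p[el] * q[el] for el in p)
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q)

def intersection(p, q):
    """Similarity: shared probability mass (1.0 for identical distributions)."""
    return sum(min(p[el], q[el]) for el in p)

def mse(p, q):
    """Mean squared error over labels."""
    return sum((p[el] - q[el]) ** 2 for el in p) / len(p)

# Hypothetical observed vs. predicted abundances for one mineral sample.
observed = to_distribution({"Mg": 1.2, "Fe": 0.6, "Ca": 0.2})
predicted = to_distribution({"Mg": 1.0, "Fe": 0.8, "Ca": 0.2})

for name, fn in [("chebyshev", chebyshev), ("kl", kl_divergence),
                 ("cosine", cosine), ("intersection", intersection), ("mse", mse)]:
    print(name, round(fn(observed, predicted), 4))
```

Lower values are better for the distance measures and the mean squared error; higher values are better for the similarity measures.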

[1] Morrison et al. (2017) Am Min, 103(6), 848-856
[2] Morrison et al. (2017) Am Min, 103(6), 857-871
[3] Geng (2016) IEEE Transactions on Knowledge and Data Engineering, 28(7), 1734-1748


Minerals vs Microbes | Exploring the complex relationships between microbial populations and their geochemical environments

The reciprocal feedbacks between microorganisms and their environment have governed much of the coevolution of the biosphere, geosphere, and atmosphere throughout geological time. Evidence from the rock record highlights massive shifts in redox chemistry, trace metal availability, and primitive respiration during ancient Earth that may have been driven at least partially by changes in plate tectonics and volcanism. Our understanding of how deep subsurface processes in modern environments influence the trajectory of microbial evolution is limited. To better characterize the interactions between microorganisms and their environment, we sequenced 35 metagenomes from microbial communities along the Costa Rica volcanic arc, where sites varied significantly in terms of pH (0.85 to 9.75), temperature (26 to 88 °C), sulfate concentrations (0.03 to 99.2 mM), and molecular hydrogen (<0.001 to 11.7 mM). Diverse pathways of carbon fixation were observed across most samples, including the Calvin-Benson Cycle and the Wood-Ljungdahl pathway. Network analysis showed sulfate and hydrogen negatively correlated with genes involved in these pathways, including cbb3-type cytochrome c oxidase. Sulfate also had a negative relationship with glycolysis, indicating that nutrient release from the deep subsurface may play a role in shaping both chemolithotrophic and heterotrophic communities at the surface.


Network Analysis | Exploiting the multivariate, multidimensional nature of complex evolving systems

A fundamental goal of mineralogy and petrology is the deep understanding of mineral phase relationships and the consequent spatial and temporal patterns of mineral coexistence in rocks, ore bodies, sediments, meteorites, and other natural polycrystalline materials. The multi-dimensional chemical complexity of such mineral assemblages has traditionally led to experimental and theoretical consideration of 2-, 3-, or n-component systems that represent simplified approximations of natural systems. Network analysis provides a dynamic, quantitative, and predictive visualization framework for employing “big data” to explore complex and otherwise hidden higher-dimensional patterns of diversity and distribution in such mineral systems. We introduce and explore applications of mineral network analysis, in which mineral species are represented by nodes, while coexistence of minerals is indicated by lines between nodes. This approach provides a dynamic visualization platform for higher-dimensional analysis of phase relationships, because topologies of equilibrium phase assemblages and pathways of mineral reaction series are embedded within the networks. Mineral networks also facilitate quantitative comparison of lithologies from different planets and moons, the analysis of coexistence patterns simultaneously among hundreds of mineral species and their localities, the exploration of varied paragenetic modes of mineral groups, and investigation of changing patterns of mineral occurrence through deep time. Mineral network analysis, furthermore, represents an effective visual approach to teaching and learning in mineralogy and petrology.
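The node-and-line representation described above can be sketched as a weighted coexistence network. This is a minimal example under stated assumptions: the assemblages are hypothetical toy data, the function names are illustrative, and edge weight is taken here to be the number of assemblages in which two minerals coexist, which is only one of several possible weighting choices.

```python
from collections import Counter
from itertools import combinations

def build_network(assemblages):
    """Return edge weights: how many assemblages contain both minerals of a pair."""
    edges = Counter()
    for minerals in assemblages:
        # Sort so each unordered pair is always stored under the same key.
        for a, b in combinations(sorted(set(minerals)), 2):
            edges[(a, b)] += 1
    return edges

def node_degree(edges):
    """Number of distinct minerals each mineral coexists with."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return degree

# Hypothetical assemblages (e.g., minerals reported together at one locality).
assemblages = [
    {"quartz", "pyrite", "galena"},
    {"quartz", "pyrite"},
    {"quartz", "calcite"},
]

edges = build_network(assemblages)
degree = node_degree(edges)

for (a, b), weight in sorted(edges.items()):
    print(f"{a} -- {b}: {weight}")
print(degree.most_common())
```

The resulting edge list can be passed to a graph library or visualization tool for layout and further metrics; minerals with high degree are those that coexist with the widest range of other species.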


Tectonic controls on mineralization | Integrating mineralogical data resources with the EarthByte GPlates plate tectonic reconstruction platform