This post summarizes existing technologies to represent math digitally.

Presentation

Math has its own language with special symbols, so we need tools to present formulas visually. There are many ways to achieve that.

ASCIIMathML

The easiest way is to use the ASCIIMathML format, which offers an intuitive, plain-text way to describe formulas.

x = (-b +- sqrt(b^2 - 4ac)) / (2a)

LaTeX

Another way is the LaTeX math format. LaTeX is a typesetting tool that is very popular in the math community, and many publications are written in it.

x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}

MathML

Presentation MathML is a format based on XML. A MathML expression is built up out of tokens that are combined using higher-level elements, which control their layout. It is inconvenient to write by hand, but its advantage is that it is natively supported by modern web browsers.
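
To give a feel for the token and layout elements, here is a minimal Python sketch that builds Presentation MathML for the same quadratic formula using the standard library's xml.etree.ElementTree. The helper names (el, mi, mo, mn) are my own shorthand and not part of any MathML tooling; the element names (mi, mo, mn, mrow, msup, msqrt, mfrac) come from the MathML specification.

```python
import xml.etree.ElementTree as ET


def el(tag, *children, text=None):
    """Create an element with optional text and child elements."""
    node = ET.Element(tag)
    if text is not None:
        node.text = text
    node.extend(children)
    return node


# Token elements: identifiers, operators and numbers (helper names are hypothetical).
def mi(name):
    return el("mi", text=name)


def mo(op):
    return el("mo", text=op)


def mn(num):
    return el("mn", text=num)


# Layout elements combine the tokens: x = (-b ± sqrt(b^2 - 4ac)) / (2a)
radicand = el("mrow", el("msup", mi("b"), mn("2")), mo("-"), mn("4"), mi("a"), mi("c"))
numerator = el("mrow", mo("-"), mi("b"), mo("±"), el("msqrt", radicand))
denominator = el("mrow", mn("2"), mi("a"))
formula = el("math", el("mrow", mi("x"), mo("="), el("mfrac", numerator, denominator)))
formula.set("xmlns", "http://www.w3.org/1998/Math/MathML")

markup = ET.tostring(formula, encoding="unicode")
print(markup)
```

Even for this short formula the markup is considerably longer than the ASCIIMathML or LaTeX versions, which is why it is rarely written by hand.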

Others

Other formats include the OpenOffice and MS Office formats. Both are based on XML and are somewhat compatible with MathML.

Conclusion

This problem is solved well. We have the MathML standard, which is easily embedded in HTML and supported by modern web browsers. Other formats like ASCIIMathML and LaTeX can be converted using the JavaScript library MathJax.

Meaning

Another challenge is to capture the meaning, or semantics, of math, not just its visual layout. Once the meaning is captured, we can use tools for automated theorem proving or inference.

Content MathML

Content MathML uses the <apply> element to construct a formula by function application. The result is an expression tree represented in XML, which is similar to LISP's S-expressions. Strict Content MathML provides a subset of Content MathML with a uniform structure and is designed to be compatible with OpenMath.
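
The similarity to S-expressions can be made concrete with a small Python sketch. It assumes a formula is given as a nested tuple of the form (operator, argument, ...), which is my own convention for this example; the element names apply, ci, cn, minus, power and times are Content MathML elements.

```python
import xml.etree.ElementTree as ET


def to_content_mathml(expr):
    """Convert a nested tuple (operator, arg1, arg2, ...) into a Content MathML tree."""
    if isinstance(expr, tuple):
        operator, *args = expr
        node = ET.Element("apply")
        ET.SubElement(node, operator)  # operator element, e.g. minus, times, power
        for arg in args:
            node.append(to_content_mathml(arg))
        return node
    if isinstance(expr, (int, float)):
        node = ET.Element("cn")  # content number
        node.text = str(expr)
        return node
    node = ET.Element("ci")  # content identifier
    node.text = expr
    return node


# b^2 - 4ac, written like the S-expression (minus (power b 2) (times 4 a c))
discriminant = ("minus", ("power", "b", 2), ("times", 4, "a", "c"))
tree = to_content_mathml(discriminant)
print(ET.tostring(tree, encoding="unicode"))
```

Each tuple becomes one apply element whose first child names the operator, so the XML nesting mirrors the parenthesization of the S-expression.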

OpenMath

OpenMath is an XML-based language for specifying the meaning of mathematical formulas. OpenMath knowledge is represented in Content Dictionaries, which consist of lists of Symbols. Each Symbol can have the following properties:

  • name
  • description
  • role
  • Commented Mathematical Properties (CMPs) - expressed in natural language
  • Formal Mathematical Properties (FMPs) - expressed formally in OpenMath
  • examples

There is a list of core OpenMath Content Dictionaries, which describe some mathematical theories.
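
The symbol properties listed above can be pictured as a simple data structure. The following Python sketch is only an illustration: the class and field names are my own, FMPs are kept as plain strings rather than real OpenMath objects, and the entry for plus is paraphrased rather than copied from the official arith1 dictionary.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Symbol:
    """One entry of a Content Dictionary (field names are illustrative)."""
    name: str
    description: str
    role: str = "application"
    cmps: List[str] = field(default_factory=list)      # properties in natural language
    fmps: List[str] = field(default_factory=list)      # formal properties (plain strings here)
    examples: List[str] = field(default_factory=list)


@dataclass
class ContentDictionary:
    name: str
    symbols: Dict[str, Symbol] = field(default_factory=dict)

    def add(self, symbol: Symbol) -> None:
        self.symbols[symbol.name] = symbol


# Paraphrased entry, loosely modeled on the "plus" symbol of the arith1 dictionary.
arith1 = ContentDictionary("arith1")
arith1.add(Symbol(
    name="plus",
    description="An n-ary commutative addition function.",
    role="application",
    cmps=["For all a and b, a + b = b + a."],
    fmps=["forall a, b: plus(a, b) = plus(b, a)"],
    examples=["plus(1, 2)"],
))

print(arith1.symbols["plus"].cmps[0])
```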

OMDoc

OMDoc (Open Mathematical Documents) is a semantic markup format for mathematical documents; the OMDoc 1.2 specification is available as a PDF. It can include OpenMath and MathML definitions. OMDoc documents represent knowledge in Theories. A Theory is a set of contextually related Statements (e.g. definitions, theorems, proofs, examples, and the relations between them). Theories may import each other, thereby forming a graph. Seen as collections of symbol definitions, OMDoc theories are compatible with OpenMath content dictionaries.
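
The import graph can be sketched in a few lines of Python. This is a toy model and not OMDoc's actual data model: the Theory class, the visible_theories function and the example theories are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Theory:
    """A named set of statements plus the names of the theories it imports."""
    name: str
    statements: List[str] = field(default_factory=list)
    imports: List[str] = field(default_factory=list)


def visible_theories(name: str, theories: Dict[str, Theory]) -> Set[str]:
    """Return the theory itself plus everything it imports, directly or transitively."""
    seen: Set[str] = set()
    stack = [name]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already visited, also protects against import cycles
        seen.add(current)
        stack.extend(theories[current].imports)
    return seen


# Hypothetical example graph: each algebraic theory builds on a simpler one.
theories = {
    t.name: t
    for t in [
        Theory("semigroup", ["associativity axiom"]),
        Theory("monoid", ["unit axiom"], imports=["semigroup"]),
        Theory("group", ["inverse axiom"], imports=["monoid"]),
        Theory("ring", ["distributivity axiom"], imports=["group", "monoid"]),
    ]
}

print(sorted(visible_theories("ring", theories)))
```

Walking the graph this way yields every theory whose symbols and statements are available inside a given theory.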

OMDoc also specifies a document ontology. To make creating documents easier, there is a tool called QMath, which lets authors write documents in plain ASCII text and then transform them to OMDoc XML.

Classes:

  • Theory
  • Statement
  • Axiom, Symbol, Definition
  • Theorem, Lemma, Corollary, Proposition, Conjecture, FalseConjecture, Obligation, Postulate, Formula, Assumption, Rule
  • Example, Alternative, Proof

Conclusion

OMDoc is the leading specification. However, its complexity prevents widespread adoption and practical use. I personally think that most effort should first be put into collecting and consolidating math knowledge in a central place. To avoid discouraging contributors, this knowledge could at first be expressed only in natural language, without formal semantics. Once that first stage is complete, a second stage would aim to represent the collected knowledge formally, with semantic meaning.

Links

Sharing

One of the most important aspects of knowledge representation is the ability to share and distribute knowledge. There are several structured websites containing math knowledge.

There is also a Mathematical Knowledge Management Interest Group, which meets at annual conferences.

Classification

For classification of mathematical knowledge, two main schemes exist:

  • MSC (Mathematics Subject Classification)

    It uses identifiers such as 00-XX (General) and 11-XX (Number theory).

  • ICM (International Congress of Mathematicians) sectioning scheme

    It has the following sections:

    • Logic and foundations,
    • Algebra,
    • Number Theory,
    • Algebraic and complex geometry,
    • Geometry,
    • Topology,
    • Lie theory and generalizations,
    • Analysis,
    • Functional analysis and applications,
    • Dynamical systems and ordinary differential equations,
    • Partial differential equations,
    • Mathematical physics,
    • Probability and Statistics,
    • Combinatorics,
    • Mathematical aspects of computer science,
    • Numerical analysis and scientific computing,
    • Control theory and optimization,
    • Mathematics in science and technology,
    • Mathematics education and popularization of mathematics,
    • History of Mathematics.

Here is a table of corresponding categories.
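
MSC identifiers like the ones mentioned above are easy to process mechanically. The Python sketch below extracts the two-digit top-level area from a code. The pattern also accepts finer-grained codes such as 11A41, which is my reading of the MSC scheme rather than something taken from this post, and the AREAS table contains only the two areas named above, so every other area is reported as "unknown".

```python
import re

# Two-digit area, followed by either "-XX" / "-NN" or a letter plus "xx" / two digits.
MSC_PATTERN = re.compile(r"^(?P<area>\d{2})(?:-(?:XX|\d{2})|[A-Z](?:xx|\d{2}))$")

# Only the areas named in this post; a real table would list all of them.
AREAS = {"00": "General", "11": "Number theory"}


def top_level_area(code):
    """Return (area number, area name) for an MSC code such as '11-XX' or '11A41'."""
    match = MSC_PATTERN.match(code)
    if match is None:
        raise ValueError(f"not a recognized MSC code: {code!r}")
    area = match.group("area")
    return area, AREAS.get(area, "unknown")


print(top_level_area("11-XX"))
print(top_level_area("11A41"))
```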

Existing websites

Wikipedia

Wikipedia is a great source and has a huge number of articles about math subjects. In my opinion it also has some shortcomings. The most important one is that math articles are usually too complex to be suitable as study texts. In addition, knowledge is represented as a list of articles instead of as consistent theories.

Encyclopedia of Mathematics

Encyclopedia of Mathematics is a wiki-based encyclopedia with a great number of entries.

PlanetMath

PlanetMath is a project that aims to create a central repository for mathematical knowledge on the web. It has some interesting features, such as Automatic Reference Linking.

MathWorld

MathWorld is an extensive online mathematical resource maintained by Wolfram Research, the company known for creating the computational software Mathematica.

Others

ActiveMath is an online learning system that uses OMDoc for storing mathematical knowledge.

There is also a huge number of textbooks and learning texts in PDF. They do not use a structured format, but their content can be used to create a structured database (if the license of that content allows it).

Learning tools

Existing systems

There are some concepts, like MathFlow, and also some research systems:

Intelligent Computer Science Lecture Notes

Intelligent Computer Science Lecture Notes is a system for publishing interactive lectures. Lecture material is written in a semantic LaTeX format, then transformed to OMDoc, and finally to HTML for online interaction. This system is similar to my vision of learning software.

Here is the demo of the interface:

SWiM

SWiM is an OMDoc-based Semantic Wiki for Mathematical Knowledge Management. Interesting reading about this system, and about mathematical knowledge in general, is Christoph Lange's PhD thesis and his Diploma thesis (both available as PDF).

Screenshot of SWiM

Math specific learning use cases

A math learning system shares its common desired features with learning systems in general. I compiled some additional use cases that are specific to math.

  • Proof assistance for better understanding.

    Show a proof step by step, including the application of operators and of other theorems. Expand substitutions of other symbols. This can also be used to show the process of deriving non-trivial formulas.

  • Decoding math formulas into natural language.

    This might be really hard to implement, and it is not a crucial feature to have.
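
As an illustration of the first use case, this is roughly what a step-by-step display could show when deriving the quadratic formula used earlier in this post. The choice of completing the square as the method is mine; a real system might present the steps differently.

```latex
\begin{align*}
ax^2 + bx + c &= 0, \qquad a \neq 0 \\
x^2 + \frac{b}{a}x &= -\frac{c}{a}
  && \text{subtract } c \text{, divide by } a \\
x^2 + \frac{b}{a}x + \frac{b^2}{4a^2} &= \frac{b^2}{4a^2} - \frac{c}{a}
  && \text{add } \left(\tfrac{b}{2a}\right)^2 \text{ to both sides} \\
\left(x + \frac{b}{2a}\right)^2 &= \frac{b^2 - 4ac}{4a^2}
  && \text{factor the left side, combine the right side} \\
x + \frac{b}{2a} &= \pm\frac{\sqrt{b^2 - 4ac}}{2a}
  && \text{take square roots} \\
x &= \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}
  && \text{subtract } \tfrac{b}{2a}
\end{align*}
```

Each line applies exactly one operation, which is the granularity a proof assistant for learners would need to expose.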



Published

15 December 2012