Import SuperbChemistry version 2

diff --git a/extension/SuperbChemistry/Main.xba b/extension/SuperbChemistry/Main.xba

index e5a5a82..33ccbd1 100644
--- a/extension/SuperbChemistry/Main.xba
+++ b/extension/SuperbChemistry/Main.xba
@@ -1,19 +1,39 @@
  <?xml version="1.0" encoding="UTF-8"?>
  <!DOCTYPE script:module PUBLIC "-//OpenOffice.org//DTD OfficeDocument 1.0//EN" "module.dtd">
-<script:module xmlns:script="http://openoffice.org/2000/script" script:name="Main" script:language="StarBasic">&apos; Matt McCutchen&apos;s SuperbChemistry for OpenOffice, version 1
+<script:module xmlns:script="http://openoffice.org/2000/script" script:name="Main" script:language="StarBasic">&apos; Matt McCutchen&apos;s SuperbChemistry for OpenOffice, version 2
  &apos;
  &apos; Applies superscript and subscript formatting to chemical formulas in text.
  &apos;
+&apos; Rules:
+&apos; - Quantities [0-9]+ and charges [0-9]*[-+−] are recognized after an element
+&apos;   symbol [A-Z][a-z]? or a closing delimiter [])}] .  Hyphens are converted
+&apos;   into real minus signs.
+&apos; - A charge sign [-+−] is ignored if it is followed by a letter, digit,
+&apos;   opening delimiter, or [&lt;&gt;] .  (Charges should appear only at the end of a
+&apos;   formula, and we want to avoid matching ordinary hyphens in text.)
+&apos; - When digits followed by a charge sign are recognized, the last digit
+&apos;   becomes part of the charge and the remaining digits become the quantity.
+&apos;   (Charges rarely have absolute value more than 9.)
+&apos; - Exception: If a single digit follows O or a closing delimiter, that digit
+&apos;   is always the quantity.  (Handle NO3- and Fe(OH)2+.  I think oxygen is the
+&apos;   only element that frequently has a quantity as part of a +/-1 ion.  A group
+&apos;   is rarely parenthesized unless it has a quantity.)
+&apos;
  &apos; Examples:
  &apos; C12345 ==&gt; C_{12345}
  &apos; H+ ==&gt; H^+
  &apos; Cl- ==&gt; Cl^-
  &apos; Fe3+ ==&gt; Fe^{3+}
+&apos; SO42- ==&gt; SO_4^{2-}
  &apos; C1232+ ==&gt; C_{123}^{2+}
  &apos; N2- ==&gt; N^{2-}
-&apos; Exception for O and ): NO3- ==&gt; NO_3^-, Fe(OH)2- ==&gt; Fe(OH)_2^-
-&apos; But still O12 ==&gt; O_{12}
-&apos; 4+ ==&gt; 4+ (not a superscript by itself)
+&apos; NO3- ==&gt; NO_3^-
+&apos; Fe(OH)2- ==&gt; Fe(OH)_2^-
+&apos; O12 ==&gt; O_{12}
+&apos; y4- ==&gt; y4-
+&apos; x2 ==&gt; x2
+&apos; Foo2 ==&gt; Foo2
+&apos; TI-89 ==&gt; TI-89
  
  &apos; Regular expression replace in the document,
  &apos; creating superscripts if superb &gt; 0 or subscripts if superb &lt; 0.
@@ -23,6 +43,7 @@ sub SuperbReplace(doc as object, searchStr as string, replaceStr as string, supe
  dim rd as object
  rd = doc.createReplaceDescriptor()
  
+rd.SearchCaseSensitive = true
  rd.SearchRegularExpression = true
  rd.setSearchString(searchStr)
  rd.setReplaceString(replaceStr)
@@ -47,29 +68,37 @@ end sub
  &apos; Formats the current document
  sub FormatDocument
  
-&apos; Mark candidate superscripts so we know they follow letters or ).
-SuperbReplace(ThisComponent, &quot;[A-Za-z)][0-9]*[-+−]&quot;, &quot;&amp;@l@&quot;, 0)
+&apos; Idiom: Match something and tag it on the left or right with @x@
+&apos; for further processing.  If the replacement text could use
+&apos; backreferences, this would be easier.
+
+&apos; Tag candidate quantities/charges following element symbols or closing delimiters )]}.
+SuperbReplace(ThisComponent, &quot;([A-Z][a-z]?|[\])}])[0-9]*[-+−]&quot;, &quot;&amp;@g@&quot;, 0)
+
+&apos; Disqualify + and - in compound words, etc.
+SuperbReplace(ThisComponent, &quot;@g@[[({A-Za-z0-9&lt;&gt;]&quot;, &quot;@G@&amp;&quot;, 0)
+SuperbReplace(ThisComponent, &quot;@G@@g@&quot;, &quot;&quot;, 0)
  
-&apos; O and ) grab a single digit.  Block it off from becoming a superscript.
-SuperbReplace(ThisComponent, &quot;[O)][0-9]&quot;, &quot;&amp;@n@&quot;, 0)
+&apos; O and )]} grab a single digit as quantity.
+SuperbReplace(ThisComponent, &quot;[\])}O][0-9]&quot;, &quot;&amp;@n@&quot;, 0)
  
-&apos; Real minus signs in superscripts.
-SuperbReplace(ThisComponent, &quot;-@l@&quot;, &quot;−@l@&quot;, 0)
+&apos; Real minus signs in charges.
+SuperbReplace(ThisComponent, &quot;-@g@&quot;, &quot;−@g@&quot;, 0)
  
-&apos; Make superscripts: at most one digit.
-SuperbReplace(ThisComponent, &quot;[0-9]?[−+]@l@&quot;, &quot;@q@&amp;&quot;, 1)
+&apos; Make charges: at most one digit.
+SuperbReplace(ThisComponent, &quot;[0-9]?[−+]@g@&quot;, &quot;@q@&amp;&quot;, 1)
  
-&apos; Remove the O and ) markers.
+&apos; Remove the @n@ markers so multi-digit quantities such as O57 can be tagged whole below.
  SuperbReplace(ThisComponent, &quot;@n@&quot;, &quot;&quot;, 0)
  
-&apos; Mark off subscripts: as many digits as we can still grab.
-SuperbReplace(ThisComponent, &quot;[A-Za-z)][0-9]+&quot;, &quot;&amp;@n@&quot;, 0)
+&apos; Tag quantities: as many digits as we can still grab.
+SuperbReplace(ThisComponent, &quot;([A-Z][a-z]?|[\])}])[0-9]+&quot;, &quot;&amp;@n@&quot;, 0)
  
-&apos; Make subscripts.
+&apos; Make quantities.
  SuperbReplace(ThisComponent, &quot;[0-9]+@n@&quot;, &quot;&amp;&quot;, -1)
  
  &apos; Clean up all markers.
-SuperbReplace(ThisComponent, &quot;@[lnq]@&quot;, &quot;&quot;, 0)
+SuperbReplace(ThisComponent, &quot;@[gGnq]@&quot;, &quot;&quot;, 0)
  
  end sub
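
The tagging passes in FormatDocument can be tried outside OpenOffice with a rough Python sketch. This is not the macro itself: it swaps the document replace descriptor for Python's `re.sub` on a plain string, and it assumes that a call with superb > 0 or superb < 0 formats the whole replaced text. The names `superb_replace`, `format_formula`, `render`, and the private-use sentinel characters are hypothetical stand-ins for the character attributes the real macro sets. The search and replace strings are copied from the version 2 code, except that the `[` inside the disqualification character class is escaped to keep Python's `re` from warning about a nested set.

```python
import re

# Hypothetical sentinels standing in for superscript/subscript attributes.
# Assumption: the real macro formats the replaced text instead of inserting
# characters, so these are only a device for simulating it on a string.
SUP, SUB, END = "\ue000", "\ue001", "\ue002"


def superb_replace(text, search, replace, superb):
    """Regex replace where '&' in `replace` stands for the whole match."""
    def repl(match):
        out = replace.replace("&", match.group(0))
        if superb > 0:
            return SUP + out + END
        if superb < 0:
            return SUB + out + END
        return out
    return re.sub(search, repl, text)


# (search, replace, superb) triples in the order FormatDocument applies them.
STEPS = [
    (r"([A-Z][a-z]?|[\])}])[0-9]*[-+−]", "&@g@", 0),  # tag candidate charges
    (r"@g@[\[({A-Za-z0-9<>]", "@G@&", 0),             # disqualify compound words
    (r"@G@@g@", "", 0),
    (r"[\])}O][0-9]", "&@n@", 0),                     # O and )]} grab one digit
    (r"-@g@", "−@g@", 0),                             # real minus sign in charges
    (r"[0-9]?[−+]@g@", "@q@&", 1),                    # make charges (superscript)
    (r"@n@", "", 0),                                  # drop the one-digit guards
    (r"([A-Z][a-z]?|[\])}])[0-9]+", "&@n@", 0),       # tag quantities
    (r"[0-9]+@n@", "&", -1),                          # make quantities (subscript)
    (r"@[gGnq]@", "", 0),                             # clean up all markers
]


def render(text):
    """Turn sentinel-wrapped spans into the ^{...} / _{...} notation."""
    def fmt(match):
        mark = "^" if match.group(1) == SUP else "_"
        body = match.group(2)
        return mark + (body if len(body) == 1 else "{" + body + "}")
    return re.sub(f"([{SUP}{SUB}])(.*?){END}", fmt, text)


def format_formula(text):
    for search, replace, superb in STEPS:
        text = superb_replace(text, search, replace, superb)
    return render(text)
```

With this sketch, `format_formula("SO42-")` returns `SO_4^{2−}` and `format_formula("TI-89")` returns `TI-89` unchanged. The minus sign in the output is the real minus (U+2212), as described in the rules, whereas the examples in the module header write it as a plain hyphen.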