<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE script:module PUBLIC "-//OpenOffice.org//DTD OfficeDocument 1.0//EN" "module.dtd">
<script:module xmlns:script="http://openoffice.org/2000/script" script:name="Main" script:language="StarBasic">' SuperbChemistry version 2.2
' http://mattmccutchen.net/schem/
' Written and maintained by Matt McCutchen <matt@mattmccutchen.net>

' Applies superscript and subscript formatting to chemical formulas in
' OpenOffice.org Writer documents.

' - Quantities [0-9]+ and charges [0-9]*[-+−] are recognized after an element
'   symbol [A-Z][a-z]? or a closing delimiter [\])}] . Hyphens are converted
'   into real minus signs.
' - A charge sign [-+−] is ignored if it is followed by a letter, digit,
'   opening delimiter, or [<>] . (Charges should appear only at the end of a
'   formula, and we want to avoid matching ordinary hyphens in text.)
' - When digits followed by a charge sign are recognized, the last digit
'   becomes part of the charge and the remaining digits become the quantity.
'   (Charges rarely have an absolute value greater than 9.)
' - In cases like X2-, we have to guess whether the digit is an atom/group
'   quantity or a charge amount. We guess atom/group quantity if X is H (NH4+),
'   O (NO3-), a halogen (SbF6-, AlCl4-, etc.), or a closing parenthesis
'   (Fe(OH)2+; the group likely would not have been parenthesized unless it had
'   a quantity). Otherwise we guess charge amount (Fe3+). This heuristic
'   should be right most of the time.

' C12345   ==> C_{12345}
' Cl-      ==> Cl^-
' Fe3+     ==> Fe^{3+}
' SO42-    ==> SO_4^{2-}
' C1232+   ==> C_{123}^{2+}
' N3-      ==> N^{3-}
' N|_3^-   not recognized (| represents "no-width no break")
' NH4+     ==> NH_4^+
' NO3-     ==> NO_3^-
' AlCl4-   ==> AlCl_4^-
' Fe(OH)2+ ==> Fe(OH)_2^+
' O12      ==> O_{12}
' y4-      not recognized
' x2       not recognized
' Foo2     not recognized
' TI-89    not recognized

' To format the current document, run the FormatDocument macro: go to Tools ->
' Macros -> Run Macro... -> My Macros -> SuperbChemistry -> Main ->
' FormatDocument -> Run. I realize that this is ugly. I tried to make the
' package install a menu item to format the document, but the resulting package
' caused OpenOffice.org to crash regularly (I didn't investigate why), so I
' abandoned that idea. Note that you can add a menu item as a user
' customization (Tools -> Customize), and I recommend it if you plan to use
' SuperbChemistry frequently.

' FormatDocument uses a sequence of regular expression find-and-replace
' operations since that was easy to implement and makes the rules easy to
' change. The operations appear in the undo history, so you can undo a
' formatting run by undoing the block of "Replace" entries at the top of
' that history.

' I would like to support formatting a selection, but the OpenOffice.org API
' does not appear to support replace-all within a selection. I could find
' within the selection and implement the replacing myself, but that is more
' work than I want to do.

' If SuperbChemistry makes a mistake (e.g., recognizes a "formula" that isn't
' one or formats a formula incorrectly), you can correct the formatting yourself
' and prevent future runs of the macro from recognizing the offending text by
' inserting a "No-width no break" character in the middle of it. This character
' is available in the "Insert -> Formatting Mark" menu when "Tools -> Options ->
' Language Settings -> Languages -> Enhanced language support ->
' Enabled for complex text layout (CTL)" is enabled.

' ==============================================================================
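
' The X2- guess described in the header can be restated as a small predicate.
' This is an illustrative sketch only: DigitIsQuantity is a hypothetical name
' that is not part of SuperbChemistry, and FormatDocument does not call it (the
' macro encodes the same rule in a regular expression instead).
function DigitIsQuantity(x as string) as boolean
	select case x
		case "H", "O", "F", "Cl", "Br", "I", ")"
			' H, O, a halogen, or a closing parenthesis: atom/group quantity
			' (NH4+, NO3-, AlCl4-, Fe(OH)2+).
			DigitIsQuantity = true
		case else
			' Anything else: charge amount (Fe3+).
			DigitIsQuantity = false
	end select
end function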

' Regular expression replace in the document,
' creating superscripts if superb > 0 or subscripts if superb < 0.
' Used by FormatDocument.
sub SuperbReplace(doc as object, searchStr as string, replaceStr as string, superb as integer)
	dim rd as object

	rd = doc.createReplaceDescriptor()

	rd.SearchCaseSensitive = true
	rd.SearchRegularExpression = true
	rd.setSearchString(searchStr)
	rd.setReplaceString(replaceStr)

	if superb <> 0 then
		' Raise (superscript) or lower (subscript) the replaced text and
		' shrink it to 58% of the normal character height.
		dim replaceAttrs(1) as new com.sun.star.beans.PropertyValue
		replaceAttrs(0).Name = "CharEscapement"
		if superb > 0 then
			replaceAttrs(0).Value = 33
		else
			replaceAttrs(0).Value = -9
		end if
		replaceAttrs(1).Name = "CharEscapementHeight"
		replaceAttrs(1).Value = 58
		rd.setReplaceAttributes(replaceAttrs)
	end if

	doc.replaceAll(rd)
end sub

' Formats the current document.
sub FormatDocument

	' Idiom: Match something and tag it on the left or right with @x@
	' for further processing. If the replacement text could use
	' backreferences, this would be easier. (I think backreferences were added
	' since I originally wrote this code, but I see no need to rewrite it to take
	' advantage of them. - Matt 2008-10-26)
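
	' Worked example of the tagging idiom: a hand trace of the replacements in
	' this sub for the text "SO42- " (a formula followed by a space). The trace
	' is illustrative, not output captured from a run.
	'   SO42- @G@            candidate charge tagged after its next character
	'   SO42-@g@ @G@         retagged in front of that character
	'   SO4@n@2-@g@ @G@      O claims one digit as a quantity
	'   SO4@n@2−@g@ @G@      hyphen replaced by a real minus sign
	'   SO4@n@@q@2−@g@ @G@   "@q@2−@g@" set as superscript
	'   SO4@q@2−@g@ @G@      @n@ removed
	'   SO4@n@@q@2−@g@ @G@   quantity tagged; @q@ keeps the 2 out of it
	'   (text unchanged)     "4@n@" set as subscript
	'   SO42−                markers removed: 4 is subscript, 2− is superscript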

	' Tag candidate charges following symbols or ), but not in compound words, etc.
	' Case 1: an acceptable next character follows. (This has to run before the
	' end-of-line case so that the next-character class cannot match the @ of an
	' @g@ tag.)
	SuperbReplace(ThisComponent, "([A-Z][a-z]?|[\])}])[0-9]*[-+−][^[({A-Za-z0-9<>]", "&@G@", 0)
	' Retag in front of the next character, so that @g@ directly follows the sign.
	SuperbReplace(ThisComponent, ".@G@", "@g@&", 0)
	' Case 2: the charge sign is at the end of the line.
	SuperbReplace(ThisComponent, "([A-Z][a-z]?|[\])}])[0-9]*[-+−]$", "&@g@", 0)

	' Some groups grab a single following digit as a quantity rather than a charge amount.
	' See detailed rationale above.
	SuperbReplace(ThisComponent, "(H|O|F|Cl|Br|I|\))[0-9]", "&@n@", 0)

	' Real minus signs in charges.
	SuperbReplace(ThisComponent, "-@g@", "−@g@", 0)

	' Make charges: at most one digit.
	SuperbReplace(ThisComponent, "[0-9]?[−+]@g@", "@q@&", 1)

	' Remove the @n@ markers placed after H, O, halogens, and ) so that a
	' multi-digit quantity such as O57 can be tagged whole below.
	SuperbReplace(ThisComponent, "@n@", "", 0)

	' Tag quantities: as many digits as we can still grab.
	SuperbReplace(ThisComponent, "([A-Z][a-z]?|[\])}])[0-9]+", "&@n@", 0)

	' Make quantities.
	SuperbReplace(ThisComponent, "[0-9]+@n@", "&", -1)

	' Clean up all markers.
	SuperbReplace(ThisComponent, "@[gGnq]@", "", 0)