Commit | Line | Data |
---|---|---|
071359bb MM |
1 | <?xml version="1.0" encoding="UTF-8"?> |
2 | <!DOCTYPE script:module PUBLIC "-//OpenOffice.org//DTD OfficeDocument 1.0//EN" "module.dtd"> | |
3b4a7e19 MM |
3 | <script:module xmlns:script="http://openoffice.org/2000/script" script:name="Main" script:language="StarBasic">' SuperbChemistry version 2.2 |
4 | ' http://mattmccutchen.net/schem/ | |
5 | ' Written and maintained by Matt McCutchen <matt@mattmccutchen.net> | |
071359bb | 6 | ' |
3b4a7e19 MM |
7 | ' Applies superscript and subscript formatting to chemical formulas in |
8 | ' OpenOffice.org Writer documents. | |
071359bb | 9 | ' |
95510495 MM |
10 | ' Rules: |
11 | ' - Quantities [0-9]+ and charges [0-9]*[-+−] are recognized after an element | |
3b4a7e19 | 12 | ' symbol [A-Z][a-z]? or a closing delimiter [\])}] . Hyphens are converted |
95510495 MM |
13 | ' into real minus signs. |
14 | ' - A charge sign [-+−] is ignored if it is followed by a letter, digit, | |
15 | ' opening delimiter, or [<>] . (Charges should appear only at the end of a | |
16 | ' formula, and we want to avoid matching ordinary hyphens in text.) | |
17 | ' - When digits followed by a charge sign are recognized, the last digit | |
18 | ' becomes part of the charge and the remaining digits become the quantity. | |
19 | ' (Charges rarely have absolute value more than 9.) | |
3b4a7e19 MM |
20 | ' - In cases like X2-, we have to guess whether the digit is an atom/group |
21 | ' quantity or a charge amount. We guess atom/group quantity if X is H (NH4+), | |
22 | ' O (NO3-), a halogen (SbF6-, AlCl4-, etc.), or a closing parenthesis | |
23 | ' (Fe(OH)2+; the group likely would not have been parenthesized unless it had | |
24 | ' a quantity). Otherwise we guess charge amount (Fe3+). This heuristic | |
25 | ' should be right most of the time. | |
95510495 | 26 | ' |
071359bb MM |
27 | ' Examples: |
28 | ' C12345 ==> C_{12345} | |
29 | ' H+ ==> H^+ | |
30 | ' Cl- ==> Cl^- | |
31 | ' Fe3+ ==> Fe^{3+} | |
95510495 | 32 | ' SO42- ==> SO_4^{2-} |
071359bb | 33 | ' C1232+ ==> C_{123}^{2+} |
3b4a7e19 MM |
34 | ' N3- ==> N^{3-} |
35 | ' N|_3^- not recognized (| represents "no-width no break") | |
36 | ' NH4+ ==> NH_4^+ | |
95510495 | 37 | ' NO3- ==> NO_3^- |
3b4a7e19 MM |
38 | ' AlCl4- ==> AlCl_4^- |
39 | ' Fe(OH)2+ ==> Fe(OH)_2^+ | |
95510495 | 40 | ' O12 ==> O_{12} |
3b4a7e19 MM |
41 | ' y4- not recognized |
42 | ' x2 not recognized | |
43 | ' Foo2 not recognized | |
44 | ' TI-89 not recognized | |
45 | ' | |
46 | ' To format the current document, run the FormatDocument macro: go to Tools -> | |
47 | ' Macros -> Run Macro... -> My Macros -> SuperbChemistry -> Main -> | |
48 | ' FormatDocument -> Run. I realize that this is ugly. I tried to make the | |
49 | ' package install a menu item to format the document, but the resulting package | |
50 | ' caused OpenOffice.org to crash regularly (I didn't investigate why), so I | |
51 | ' abandoned that idea. Note that you can add a menu item as a user | |
52 | ' customization (Tools -> Customize), and I recommend it if you plan to use | |
53 | ' SuperbChemistry frequently. | |
54 | ' | |
55 | ' FormatDocument uses a sequence of regular expression find-and-replace | |
56 | ' operations since that was easy to implement and makes the rules easy to | |
57 | ' change. The operations appear in the undo history, so you can undo a | |
58 | ' formatting run by undoing the block of "Replace" entries at the top of the | |
59 | ' history. | |
60 | ' | |
61 | ' I would like to support formatting a selection, but the OpenOffice.org API | |
62 | ' does not appear to support replace-all within a selection. I could find | |
63 | ' within the selection and implement the replacing myself, but that is more | |
64 | ' work than I want to do. | |
65 | ' | |
66 | ' If SuperbChemistry makes a mistake (e.g., recognizes a "formula" that isn't | |
67 | ' or formats a formula incorrectly), you can correct the formatting yourself | |
68 | ' and prevent future runs of the macro from recognizing the offending text by | |
69 | ' inserting a "No-width no break" character in the middle of it. This character | |
70 | ' is available in the "Insert -> Formatting Mark" menu when "Tools -> Options -> | |
71 | ' Language Settings -> Languages -> Enhanced language support -> | |
72 | ' Enabled for complex text layout (CTL)" is enabled. | |
73 | ||
74 | ' ============================================================================== | |
071359bb MM |
75 | |
76 | ' Regular expression replace in the document, | |
77 | ' creating superscripts if superb > 0 or subscripts if superb < 0. | |
3b4a7e19 | 78 | ' Used by FormatDocument. |
071359bb MM |
79 | sub SuperbReplace(doc as object, searchStr as string, replaceStr as string, superb as integer) |
80 | ||
81 | dim rd as object | |
82 | rd = doc.createReplaceDescriptor() | |
83 | ||
95510495 | 84 | rd.SearchCaseSensitive = true |
071359bb MM |
85 | rd.SearchRegularExpression = true |
86 | rd.setSearchString(searchStr) | |
87 | rd.setReplaceString(replaceStr) | |
88 | ||
89 | if superb <> 0 then | |
90 | dim replaceAttrs(1) as new com.sun.star.beans.PropertyValue | |
91 | replaceAttrs(0).Name = "CharEscapement" | |
92 | if superb > 0 then | |
93 | replaceAttrs(0).Value = 33 | |
94 | else | |
95 | replaceAttrs(0).Value = -9 | |
96 | end if | |
97 | replaceAttrs(1).Name = "CharEscapementHeight" | |
98 | replaceAttrs(1).Value = 58 | |
99 | rd.setReplaceAttributes(replaceAttrs) | |
100 | end if | |
101 | ||
102 | doc.replaceAll(rd) | |
103 | ||
104 | end sub | |
105 | ||
106 | ' Formats the current document | |
107 | sub FormatDocument | |
108 | ||
95510495 MM |
109 | ' Idiom: Match something and tag it on the left or right with @x@ |
110 | ' for further processing. If the replacement text could use | |
3b4a7e19 MM |
111 | ' backreferences, this would be easier. (I think backreferences were added |
112 | ' since I originally wrote this code, but I see no need to rewrite it to take | |
113 | ' advantage of them. - Matt 2008-10-26) | |
95510495 | 114 | |
de16a29a MM |
115 | ' Tag candidate charges following element symbols or closing delimiters, but not in compound words, etc. | |
116 | ' Acceptable next character. (This has to run before the end-of-line case; otherwise the @ of an @g@ tag would count as an acceptable next character.) | |
117 | SuperbReplace(ThisComponent, "([A-Z][a-z]?|[\])}])[0-9]*[-+−][^[({A-Za-z0-9<>]", "&@G@", 0) | |
118 | ' Retag in front. | |
119 | SuperbReplace(ThisComponent, ".@G@", "@g@&", 0) | |
120 | ' End of line. | |
121 | SuperbReplace(ThisComponent, "([A-Z][a-z]?|[\])}])[0-9]*[-+−]$", "&@g@", 0) | |
071359bb | 122 | |
3b4a7e19 MM |
123 | ' Some groups grab a single following digit as a quantity rather than a charge amount. |
124 | ' See detailed rationale above. | |
125 | SuperbReplace(ThisComponent, "(H|O|F|Cl|Br|I|\))[0-9]", "&@n@", 0) | |
071359bb | 126 | |
95510495 MM |
127 | ' Real minus signs in charges. |
128 | SuperbReplace(ThisComponent, "-@g@", "−@g@", 0) | |
071359bb | 129 | |
95510495 MM |
130 | ' Make charges: at most one digit. |
131 | SuperbReplace(ThisComponent, "[0-9]?[−+]@g@", "@q@&", 1) | |
071359bb | 132 | |
95510495 | 133 | ' Remove the @n@ markers placed after O, ), etc., so that a quantity like O57 can still be tagged whole. |
071359bb MM |
134 | SuperbReplace(ThisComponent, "@n@", "", 0) |
135 | ||
95510495 MM |
136 | ' Tag quantities: as many digits as we can still grab. |
137 | SuperbReplace(ThisComponent, "([A-Z][a-z]?|[\])}])[0-9]+", "&@n@", 0) | |
071359bb | 138 | |
95510495 | 139 | ' Make quantities. |
071359bb MM |
140 | SuperbReplace(ThisComponent, "[0-9]+@n@", "&", -1) |
141 | ||
142 | ' Clean up all markers. | |
95510495 | 143 | SuperbReplace(ThisComponent, "@[gGnq]@", "", 0) |
071359bb MM |
144 | |
145 | end sub | |
146 | ||
147 | </script:module> |
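
The tagging idiom used by FormatDocument can be tried outside the office suite. The following is a minimal sketch in Python, not part of SuperbChemistry: it replays the same ten find-and-replace steps on a plain string, assuming Python's `re` module behaves like the Writer regex engine for these patterns, that `&` in a replacement stands for the whole match, and that the input is a single paragraph (so `$` means end of string). The names `superb_replace`, `format_formula`, and `render` are hypothetical, and the output uses the `_{}` / `^{}` notation from the examples in the header comment.

```python
import itertools
import re

# Element symbol or closing delimiter, as in the macro's patterns.
SYM = r"([A-Z][a-z]?|[\])}])"

# (search pattern, replacement, superb), in the order FormatDocument runs them.
STEPS = [
    (SYM + r"[0-9]*[-+−][^\[({A-Za-z0-9<>]", "&@G@", 0),  # charge + next char
    (r".@G@", "@g@&", 0),                                  # retag in front
    (SYM + r"[0-9]*[-+−]$", "&@g@", 0),                    # charge at end
    (r"(H|O|F|Cl|Br|I|\))[0-9]", "&@n@", 0),               # digit is a quantity
    (r"-@g@", "−@g@", 0),                                  # real minus sign
    (r"[0-9]?[−+]@g@", "@q@&", 1),                         # make charges
    (r"@n@", "", 0),                                       # drop @n@ markers
    (SYM + r"[0-9]+", "&@n@", 0),                          # tag quantities
    (r"[0-9]+@n@", "&", -1),                               # make quantities
    (r"@[gGnq]@", "", 0),                                  # clean up markers
]


def superb_replace(cells, pattern, repl, superb):
    """Replace all matches in a list of (char, level) pairs.

    level is 1 for superscript, -1 for subscript, 0 for normal text.
    Assumption: with superb == 0, matched characters keep their level
    and newly inserted characters are normal.
    """
    text = "".join(ch for ch, _ in cells)
    before, amp, after = repl.partition("&")
    out, pos = [], 0
    for m in re.finditer(pattern, text):
        out.extend(cells[pos:m.start()])
        piece = [(ch, 0) for ch in before]
        if amp:  # "&" present: splice the matched text back in
            piece += cells[m.start():m.end()] + [(ch, 0) for ch in after]
        if superb:
            piece = [(ch, superb) for ch, _ in piece]
        out.extend(piece)
        pos = m.end()
    out.extend(cells[pos:])
    return out


def render(cells):
    """Write runs of raised or lowered characters as ^x, ^{xy}, _x, _{xy}."""
    parts = []
    for level, group in itertools.groupby(cells, key=lambda c: c[1]):
        run = "".join(ch for ch, _ in group)
        if level:
            mark = "^" if level > 0 else "_"
            run = mark + (run if len(run) == 1 else "{" + run + "}")
        parts.append(run)
    return "".join(parts)


def format_formula(text):
    cells = [(ch, 0) for ch in text]
    for pattern, repl, superb in STEPS:
        cells = superb_replace(cells, pattern, repl, superb)
    return render(cells)
```

For example, `format_formula("SO42-")` gives `SO_4^{2−}` and `format_formula("TI-89")` leaves the text unchanged. Hyphens in charges come out as real minus signs (U+2212), as described in the rules.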