<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE script:module PUBLIC "-//OpenOffice.org//DTD OfficeDocument 1.0//EN" "module.dtd">
<script:module xmlns:script="http://openoffice.org/2000/script" script:name="Main" script:language="StarBasic">' SuperbChemistry version 2.2
' http://mattmccutchen.net/schem/
' Written and maintained by Matt McCutchen <matt@mattmccutchen.net>
'
' Applies superscript and subscript formatting to chemical formulas in
' OpenOffice.org Writer documents.
'
' Rules:
' - Quantities [0-9]+ and charges [0-9]*[-+−] are recognized after an element
'   symbol [A-Z][a-z]? or a closing delimiter [\])}] . Hyphens are converted
'   into real minus signs.
' - A charge sign [-+−] is ignored if it is followed by a letter, digit,
'   opening delimiter, or [<>] . (Charges should appear only at the end of a
'   formula, and we want to avoid matching ordinary hyphens in text.)
' - When digits followed by a charge sign are recognized, the last digit
'   becomes part of the charge and the remaining digits become the quantity.
'   (Charges rarely have absolute value more than 9.)
' - In cases like X2-, we have to guess whether the digit is an atom/group
'   quantity or a charge amount. We guess atom/group quantity if X is H (NH4+),
'   O (NO3-), a halogen (SbF6-, AlCl4-, etc.), or a closing parenthesis
'   (Fe(OH)2+; the group likely would not have been parenthesized unless it had
'   a quantity). Otherwise we guess charge amount (Fe3+). This heuristic
'   should be right most of the time.
'
' Examples:
'   C12345    ==>  C_{12345}
'   H+        ==>  H^+
'   Cl-       ==>  Cl^-
'   Fe3+      ==>  Fe^{3+}
'   SO42-     ==>  SO_4^{2-}
'   C1232+    ==>  C_{123}^{2+}
'   N3-       ==>  N^{3-}
'   N|_3^-    not recognized (| represents "no-width no break")
'   NH4+      ==>  NH_4^+
'   NO3-      ==>  NO_3^-
'   AlCl4-    ==>  AlCl_4^-
'   Fe(OH)2+  ==>  Fe(OH)_2^+
'   O12       ==>  O_{12}
'   y4-       not recognized
'   x2        not recognized
'   Foo2      not recognized
'   TI-89     not recognized
'
' To format the current document, run the FormatDocument macro: go to Tools ->
' Macros -> Run Macro... -> My Macros -> SuperbChemistry -> Main ->
' FormatDocument -> Run. I realize that this is ugly. I tried to make the
' package install a menu item to format the document, but the resulting package
' caused OpenOffice.org to crash regularly (I didn't investigate why), so I
' abandoned that idea. Note that you can add a menu item as a user
' customization (Tools -> Customize), and I recommend it if you plan to use
' SuperbChemistry frequently.
'
' FormatDocument uses a sequence of regular expression find-and-replace
' operations since that was easy to implement and makes the rules easy to
' change. The operations appear in the undo history, so you can undo a
' formatting run by undoing the block of "Replace" entries at the top of the
' history.
'
' I would like to support formatting a selection, but the OpenOffice.org API
' does not appear to support replace-all within a selection. I could find
' within the selection and implement the replacing myself, but that is more
' work than I want to do.
'
' If SuperbChemistry makes a mistake (e.g., recognizes a "formula" that isn't
' one, or formats a formula incorrectly), you can correct the formatting
' yourself and prevent future runs of the macro from recognizing the offending
' text by inserting a "No-width no break" character in the middle of it. This
' character is available in the "Insert -> Formatting Mark" menu when "Tools ->
' Options -> Language Settings -> Languages -> Enhanced language support ->
' Enabled for complex text layout (CTL)" is enabled.

' ==============================================================================

' Regular expression replace in the document,
' creating superscripts if superb > 0 or subscripts if superb < 0.
' If superb = 0, the replacement text gets no special formatting.
' Used by FormatDocument.
sub SuperbReplace(doc as object, searchStr as string, replaceStr as string, superb as integer)

    dim rd as object
    rd = doc.createReplaceDescriptor()

    rd.SearchCaseSensitive = true
    rd.SearchRegularExpression = true
    rd.setSearchString(searchStr)
    rd.setReplaceString(replaceStr)

    if superb <> 0 then
        dim replaceAttrs(1) as new com.sun.star.beans.PropertyValue
        ' CharEscapement is the vertical offset in percent:
        ' positive raises the text, negative lowers it.
        replaceAttrs(0).Name = "CharEscapement"
        if superb > 0 then
            replaceAttrs(0).Value = 33
        else
            replaceAttrs(0).Value = -9
        end if
        ' CharEscapementHeight is the relative character size in percent.
        replaceAttrs(1).Name = "CharEscapementHeight"
        replaceAttrs(1).Value = 58
        rd.setReplaceAttributes(replaceAttrs)
    end if

    doc.replaceAll(rd)

end sub
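
' A minimal sketch, not part of the original macro: counts how many matches a
' rule pattern has in the document without changing anything, which may help
' when adjusting the rules. CountMatches is a hypothetical helper name. It
' assumes doc supports createSearchDescriptor and findAll
' (com.sun.star.util.XSearchable), as Writer text documents do.
' Example use:
'   MsgBox CountMatches(ThisComponent, "([A-Z][a-z]?|[\])}])[0-9]+")
function CountMatches(doc as object, searchStr as string) as long

    dim sd as object
    sd = doc.createSearchDescriptor()

    ' Same search settings as SuperbReplace.
    sd.SearchCaseSensitive = true
    sd.SearchRegularExpression = true
    sd.setSearchString(searchStr)

    ' findAll returns an indexed container of the matching text ranges.
    CountMatches = doc.findAll(sd).getCount()

end function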

' Formats the current document
sub FormatDocument

    ' Idiom: Match something and tag it on the left or right with @x@
    ' for further processing. If the replacement text could use
    ' backreferences, this would be easier. (I think backreferences were added
    ' since I originally wrote this code, but I see no need to rewrite it to take
    ' advantage of them. - Matt 2008-10-26)

    ' Tag candidate charges following symbols or ), but not in compound words, etc.
    ' Case 1: the charge is followed by an acceptable next character. (This pass
    ' has to run before the end-of-line pass to avoid matching the @g@ tag itself.)
    SuperbReplace(ThisComponent, "([A-Z][a-z]?|[\])}])[0-9]*[-+−][^[({A-Za-z0-9<>]", "&@G@", 0)
    ' Retag in front: put @g@ between the charge and the following character.
    ' The leftover @G@ is removed in the final cleanup.
    SuperbReplace(ThisComponent, ".@G@", "@g@&", 0)
    ' Case 2: the charge is at the end of the line.
    SuperbReplace(ThisComponent, "([A-Z][a-z]?|[\])}])[0-9]*[-+−]$", "&@g@", 0)

    ' After H, O, a halogen, or a closing parenthesis, a single following digit
    ' is marked with @n@ so that it is kept as a quantity rather than taken as a
    ' charge amount. See detailed rationale above.
    SuperbReplace(ThisComponent, "(H|O|F|Cl|Br|I|\))[0-9]", "&@n@", 0)

    ' Real minus signs in charges.
    SuperbReplace(ThisComponent, "-@g@", "−@g@", 0)

    ' Make charges: at most one digit.
    SuperbReplace(ThisComponent, "[0-9]?[−+]@g@", "@q@&", 1)

    ' Remove the @n@ markers placed above, so that a multi-digit quantity
    ' such as O57 can be tagged as a whole in the next step.
    SuperbReplace(ThisComponent, "@n@", "", 0)

    ' Tag quantities: as many digits as we can still grab.
    SuperbReplace(ThisComponent, "([A-Z][a-z]?|[\])}])[0-9]+", "&@n@", 0)

    ' Make quantities.
    SuperbReplace(ThisComponent, "[0-9]+@n@", "&", -1)

    ' Clean up all markers.
    SuperbReplace(ThisComponent, "@[gGnq]@", "", 0)

end sub
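
' A minimal sketch, not part of the original macro: runs FormatDocument inside
' one undo context so that a formatting run might be undone in a single step
' instead of undoing each "Replace" entry. FormatDocumentOneUndo is a
' hypothetical name. This assumes an office version whose documents provide
' getUndoManager (com.sun.star.document.XUndoManager, OpenOffice.org 3.4 or
' later), and that the replace operations are grouped under the context.
sub FormatDocumentOneUndo

    dim undoMgr as object
    undoMgr = ThisComponent.getUndoManager()

    undoMgr.enterUndoContext("SuperbChemistry formatting")
    ' Leave the undo context even if a replace operation fails.
    on error goto Finish
    FormatDocument
Finish:
    undoMgr.leaveUndoContext()

end sub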

</script:module>