' SuperbChemistry version 2.2
' http://mattmccutchen.net/schem/
' Written and maintained by Matt McCutchen <matt@mattmccutchen.net>
'
' Applies superscript and subscript formatting to chemical formulas in
' OpenOffice.org Writer documents.
'
' Rules:
' - Quantities [0-9]+ and charges [0-9]*[-+−] are recognized after an element
'   symbol [A-Z][a-z]? or a closing delimiter [\])}] . Hyphens are converted
'   into real minus signs.
' - A charge sign [-+−] is ignored if it is followed by a letter, digit,
'   opening delimiter, or [<>] . (Charges should appear only at the end of a
'   formula, and we want to avoid matching ordinary hyphens in text.)
' - When digits followed by a charge sign are recognized, the last digit
'   becomes part of the charge and the remaining digits become the quantity.
'   (Charges rarely have absolute value more than 9.)
' - In cases like X2-, we have to guess whether the digit is an atom/group
'   quantity or a charge amount. We guess atom/group quantity if X is H (NH4+),
'   O (NO3-), a halogen (SbF6-, AlCl4-, etc.), or a closing parenthesis
'   (Fe(OH)2+; the group likely would not have been parenthesized unless it had
'   a quantity). Otherwise we guess charge amount (Fe3+). This heuristic
'   should be right most of the time.
'
' Examples:
'   C12345     ==> C_{12345}
'   H+         ==> H^+
'   Cl-        ==> Cl^-
'   Fe3+       ==> Fe^{3+}
'   SO42-      ==> SO_4^{2-}
'   C1232+     ==> C_{123}^{2+}
'   N3-        ==> N^{3-}
'   N|_3^-     not recognized (| represents "no-width no break")
'   NH4+       ==> NH_4^+
'   NO3-       ==> NO_3^-
'   AlCl4-     ==> AlCl_4^-
'   Fe(OH)2+   ==> Fe(OH)_2^+
'   O12        ==> O_{12}
'   y4-        not recognized
'   x2         not recognized
'   Foo2       not recognized
'   TI-89      not recognized
'
' To format the current document, run the FormatDocument macro: go to Tools ->
' Macros -> Run Macro... -> My Macros -> SuperbChemistry -> Main ->
' FormatDocument -> Run. I realize that this is ugly.
' I tried to make the package install a menu item to format the document, but
' the resulting package caused OpenOffice.org to crash regularly (I didn't
' investigate why), so I abandoned that idea. Note that you can add a menu item
' as a user customization (Tools -> Customize), and I recommend it if you plan
' to use SuperbChemistry frequently.
'
' FormatDocument uses a sequence of regular expression find-and-replace
' operations since that was easy to implement and makes the rules easy to
' change. The operations appear in the undo history, so you can undo a
' formatting run by undoing the block of "Replace" entries at the top of the
' history.
'
' I would like to support formatting a selection, but the OpenOffice.org API
' does not appear to support replace-all within a selection. I could find
' within the selection and implement the replacing myself, but that is more
' work than I want to do.
'
' If SuperbChemistry makes a mistake (e.g., recognizes a "formula" that isn't
' or formats a formula incorrectly), you can correct the formatting yourself
' and prevent future runs of the macro from recognizing the offending text by
' inserting a "No-width no break" character in the middle of it. This character
' is available in the "Insert -> Formatting Mark" menu when "Tools -> Options ->
' Language Settings -> Languages -> Enhanced language support ->
' Enabled for complex text layout (CTL)" is enabled.

' ==============================================================================

' Regular expression replace in the document,
' creating superscripts if superb > 0 or subscripts if superb < 0.
' Used by FormatDocument.
sub SuperbReplace(doc as object, searchStr as string, replaceStr as string, superb as integer)
    dim rd as object
    rd = doc.createReplaceDescriptor()
    rd.SearchCaseSensitive = true
    rd.SearchRegularExpression = true
    rd.setSearchString(searchStr)
    rd.setReplaceString(replaceStr)
    if superb <> 0 then
        dim replaceAttrs(1) as new com.sun.star.beans.PropertyValue
        replaceAttrs(0).Name = "CharEscapement"
        if superb > 0 then
            replaceAttrs(0).Value = 33
        else
            replaceAttrs(0).Value = -9
        end if
        replaceAttrs(1).Name = "CharEscapementHeight"
        replaceAttrs(1).Value = 58
        rd.setReplaceAttributes(replaceAttrs)
    end if
    doc.replaceAll(rd)
end sub

' Formats the current document
sub FormatDocument
    ' Idiom: Match something and tag it on the left or right with @x@
    ' for further processing. If the replacement text could use
    ' backreferences, this would be easier. (I think backreferences were added
    ' since I originally wrote this code, but I see no need to rewrite it to take
    ' advantage of them. - Matt 2008-10-26)

    ' Tag candidate charges following symbols or ), but not in compound words, etc.
    ' Acceptable next character. (Has to be before end of line to avoid matching @g@ tag itself.)
    SuperbReplace(ThisComponent, "([A-Z][a-z]?|[\])}])[0-9]*[-+−][^[({A-Za-z0-9<>]", "&@G@", 0)
    ' Retag in front.
    SuperbReplace(ThisComponent, ".@G@", "@g@&", 0)
    ' End of line.
    SuperbReplace(ThisComponent, "([A-Z][a-z]?|[\])}])[0-9]*[-+−]$", "&@g@", 0)

    ' Some groups grab a single following digit as a quantity rather than a charge amount.
    ' See detailed rationale above.
    SuperbReplace(ThisComponent, "(H|O|F|Cl|Br|I|\))[0-9]", "&@n@", 0)

    ' Real minus signs in charges.
    SuperbReplace(ThisComponent, "-@g@", "−@g@", 0)
    ' Make charges: at most one digit.
    SuperbReplace(ThisComponent, "[0-9]?[−+]@g@", "@q@&", 1)

    ' Remove the @n@ markers placed after O, ), etc., so that a multi-digit
    ' quantity such as O57 can be tagged as a whole below.
    SuperbReplace(ThisComponent, "@n@", "", 0)
    ' Tag quantities: as many digits as we can still grab.
    SuperbReplace(ThisComponent, "([A-Z][a-z]?|[\])}])[0-9]+", "&@n@", 0)
    ' Make quantities.
    SuperbReplace(ThisComponent, "[0-9]+@n@", "&", -1)

    ' Clean up all markers.
    SuperbReplace(ThisComponent, "@[gGnq]@", "", 0)
end sub
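
' ------------------------------------------------------------------------------
' Optional sketch, not part of the original macro.
'
' A formatting run appears in the undo history as a block of separate "Replace"
' entries. The sub below is a minimal sketch of one way to collapse that block
' into a single undo step. Assumptions: FormatDocumentAsOneUndoStep is a
' hypothetical name introduced here; the document must expose an undo manager
' through getUndoManager() (com.sun.star.document.XUndoManagerSupplier), which
' older OpenOffice.org releases may not provide; and the replaceAll operations
' are assumed to be recorded inside the open undo context. This has not been
' tested against every OpenOffice.org version.
sub FormatDocumentAsOneUndoStep
    dim undoManager as object
    undoManager = ThisComponent.getUndoManager()

    ' Everything recorded between enterUndoContext and leaveUndoContext is
    ' intended to show up as one entry with this title.
    undoManager.enterUndoContext("SuperbChemistry formatting")

    ' If FormatDocument fails, still close the context so that the undo
    ' history is not left with an open group. Note that this swallows the error.
    on error goto CloseContext
    FormatDocument

CloseContext:
    undoManager.leaveUndoContext()
end sub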