X-Git-Url: https://mattmccutchen.net/superbchemistry/superbchemistry.git/blobdiff_plain/3b4a7e19b216d5010dec1d93b063d51d00666b94..HEAD:/extension/SuperbChemistry/Main.xba diff --git a/extension/SuperbChemistry/Main.xba b/extension/SuperbChemistry/Main.xba index 530cf8d..52080f3 100644 --- a/extension/SuperbChemistry/Main.xba +++ b/extension/SuperbChemistry/Main.xba @@ -1,147 +1,233 @@ -' SuperbChemistry version 2.2 -' http://mattmccutchen.net/schem/ -' Written and maintained by Matt McCutchen <matt@mattmccutchen.net> -' -' Applies superscript and subscript formatting to chemical formulas in -' OpenOffice.org Writer documents. -' -' Rules: -' - Quantities [0-9]+ and charges [0-9]*[-+−] are recognized after an element -' symbol [A-Z][a-z]? or a closing delimiter [\])}] . Hyphens are converted -' into real minus signs. -' - A charge sign [-+−] is ignored if it is followed by a letter, digit, -' opening delimiter, or [<>] . (Charges should appear only at the end of a -' formula, and we want to avoid matching ordinary hyphens in text.) -' - When digits followed by a charge sign are recognized, the last digit -' becomes part of the charge and the remaining digits become the quantity. -' (Charges rarely have absolute value more than 9.) -' - In cases like X2-, we have to guess whether the digit is an atom/group -' quantity or a charge amount. We guess atom/group quantity if X is H (NH4+), -' O (NO3-), a halogen (SbF6-, AlCl4-, etc.), or a closing parenthesis -' (Fe(OH)2+; the group likely would not have been parenthesized unless it had -' a quantity). Otherwise we guess charge amount (Fe3+). This heuristic -' should be right most of the time. -' -' Examples: -' C12345 ==> C_{12345} -' H+ ==> H^+ -' Cl- ==> Cl^- -' Fe3+ ==> Fe^{3+} -' SO42- ==> SO_4^{2-} -' C1232+ ==> C_{123}^{2+} -' N3- ==> N^{3-} -' N|_3^- not recognized (| represents "no-width no break") -' NH4+ ==> NH_4^+ -' NO3- ==> NO_3^- -' AlCl4- => AlCl_4^- -' Fe(OH)2+ ==> Fe(OH)_2^+ -' O12 ==> O_{12} -' y4- not recognized -' x2 not recognized -' Foo2 not recognized -' TI-89 not recognized -' -' To format the current document, run the FormatDocument macro: go to Tools -> -' Macros -> Run Macro... -> My Macros -> SuperbChemistry -> Main -> -' FormatDocument -> Run. I realize that this is ugly. I tried to make the -' package install a menu item to format the document, but the resulting package -' caused OpenOffice.org to crash regularly (I didn't investigate why), so I -' abandoned that idea. Note that you can add a menu item as a user -' customization (Tools -> Customize), and I recommend it if you plan to use -' SuperbChemistry frequently. -' -' FormatDocument uses a sequence of regular expression find-and-replace -' operations since that was easy to implement and makes the rules easy to -' change. The operations appear in the undo history, so you can undo a -' formatting run by undoing the block of "Replace" entries at the top of the -' history. -' -' I would like to support formatting a selection, but the OpenOffice.org API -' does not appear to support replace-all within a selection. I could find -' within the selection and implement the replacing myself, but that is more -' work than I want to do. -' -' If SuperbChemistry makes a mistake (e.g., recognizes a "formula" that isn't -' or formats a formula incorrectly), you can correct the formatting yourself -' and prevent future runs of the macro from recognizing the offending text by -' inserting a "No-width no break" character in the middle of it. This character -' is available in the "Insert -> Formatting Mark" menu when "Tools -> Options -> -' Language Settings -> Languages -> Enhanced language support -> -' Enabled for complex text layout (CTL)" is enabled. - -' ============================================================================== - -' Regular expression replace in the document, -' creating superscripts if superb > 0 or subscripts if superb < 0. -' Used by FormatDocument. -sub SuperbReplace(doc as object, searchStr as string, replaceStr as string, superb as integer) - -dim rd as object -rd = doc.createReplaceDescriptor() - -rd.SearchCaseSensitive = true -rd.SearchRegularExpression = true -rd.setSearchString(searchStr) -rd.setReplaceString(replaceStr) - -if superb <> 0 then - dim replaceAttrs(1) as new com.sun.star.beans.PropertyValue - replaceAttrs(0).Name = "CharEscapement" - if superb > 0 then - replaceAttrs(0).Value = 33 - else - replaceAttrs(0).Value = -9 - end if - replaceAttrs(1).Name = "CharEscapementHeight" - replaceAttrs(1).Value = 58 - rd.setReplaceAttributes(replaceAttrs) -end if - -doc.replaceAll(rd) - -end sub - -' Formats the current document -sub FormatDocument - -' Idiom: Match something and tag it on the left or right with @x@ -' for further processing. If the replacement text could use -' backreferences, this would be easier. (I think backreferences were added -' since I originally wrote this code, but I see no need to rewrite it to take -' advantage of them. - Matt 2008-10-26) - -' Tag candidate charges following symbols or ), but not in compound words, etc. -' Acceptable next character. (Has to be before end of line to avoid matching @g@ tag itself.) -SuperbReplace(ThisComponent, "([A-Z][a-z]?|[\])}])[0-9]*[-+−][^[({A-Za-z0-9<>]", "&@G@", 0) -' Retag in front. -SuperbReplace(ThisComponent, ".@G@", "@g@&", 0) -' End of line. -SuperbReplace(ThisComponent, "([A-Z][a-z]?|[\])}])[0-9]*[-+−]$", "&@g@", 0) - -' Some groups grab a single following digit as a quantity rather than a charge amount. -' See detailed rationale above. -SuperbReplace(ThisComponent, "(H|O|F|Cl|Br|I|\))[0-9]", "&@n@", 0) - -' Real minus signs in charges. -SuperbReplace(ThisComponent, "-@g@", "−@g@", 0) - -' Make charges: at most one digit. -SuperbReplace(ThisComponent, "[0-9]?[−+]@g@", "@q@&", 1) - -' Remove the O and ) markers in case of O57. -SuperbReplace(ThisComponent, "@n@", "", 0) - -' Tag quantities: as many digits as we can still grab. -SuperbReplace(ThisComponent, "([A-Z][a-z]?|[\])}])[0-9]+", "&@n@", 0) - -' Make quantities. -SuperbReplace(ThisComponent, "[0-9]+@n@", "&", -1) - -' Clean up all markers. -SuperbReplace(ThisComponent, "@[gGnq]@", "", 0) - -end sub +Option Explicit + +Function HaveSelection(doc As Object) + Dim sel as Object + sel = doc.CurrentController.Selection + If sel.Count > 1 Then + HaveSelection = True + Exit Function + End If + Dim s0 + s0 = sel.getByIndex(0) + If s0.Text.compareRegionStarts(s0.Start, s0.End) = 0 Then + HaveSelection = False + Else + HaveSelection = True + End If +End Function + +' Regular expression replace in the document, creating superscripts if +' superb > 0 or subscripts if superb < 0. +Sub SuperbReplace(doc As Object, searchRegex As String, replacePattern As String, superb As Integer) + + Dim rd As Object + rd = doc.createReplaceDescriptor() + + rd.SearchCaseSensitive = true + rd.SearchRegularExpression = true + rd.setSearchString(searchRegex) + rd.setReplaceString(replacePattern) + + If superb <> 0 Then + Dim replaceAttrs(1) As New com.sun.star.beans.PropertyValue + replaceAttrs(0).Name = "CharEscapement" + If superb > 0 Then + replaceAttrs(0).Value = 33 + Else + ' The default escapement for subscripts is -33, which looked bad to me in + ' chemical formulas. This looks better. Modify to your taste. + replaceAttrs(0).Value = -9 + End If + replaceAttrs(1).Name = "CharEscapementHeight" + replaceAttrs(1).Value = 58 + rd.setReplaceAttributes(replaceAttrs) + End If + + doc.replaceAll(rd) + +End Sub + +Sub ReplaceInSelection(doc As Object, searchRegex As String, replacePattern As String) + + Dim frame As Object, dispatcher As Object + frame = doc.CurrentController.Frame + dispatcher = createUnoService("com.sun.star.frame.DispatchHelper") + + Dim args(6) As New com.sun.star.beans.PropertyValue + args(0).Name = "SearchItem.AlgorithmType" + args(0).Value = 1 + args(1).Name = "SearchItem.SearchFlags" + args(1).Value = &H1800 ' Search in selection + args(2).Name = "SearchItem.SearchString" + args(2).Value = searchRegex + args(3).Name = "SearchItem.ReplaceString" + args(3).Value = replacePattern + args(4).Name = "SearchItem.Command" + args(4).Value = 3 + args(5).Name = "SearchItem.AlgorithmType2" + args(5).Value = 2 + args(6).Name = "Quiet" + args(6).Value = true + + dispatcher.executeDispatch(frame, ".uno:ExecuteSearch", "", 0, args()) + +End Sub + +Global formatSelectionWarningShown As Boolean +' I haven't found a way to initialize the variable, but it looks like its +' default value is treated as false in an if statement. +'formatSelectionWarningShown = False + +Sub FormatSelectionOrDocumentDebug() + ' Replacing with an empty replacement string triggers a bug in LibreOffice + ' (https://bugs.documentfoundation.org/show_bug.cgi?id=136577), so we must + ' avoid it. Fortunately, avoiding it is pretty straightforward. + + ' Step 1: Insert @m@ between an item and a number or charge that may be part of + ' a chemical formula (subject to later checks). + If HaveSelection(ThisComponent) Then + ' doc.replaceAll is not capable of searching a selection, while the + ' dispatch-based replace API uses the current format options in the + ' "Find & Replace" dialog and does not let us change them, which means that + ' (1) existing settings can break things and (2) we cannot make superscripts + ' and subscripts. We do the best we can by using one dispatch-based replace + ' to tag all the sequences we want to operate on (which loses the selection in + ' OpenOffice anyway) and then proceed with replaceAll calls on the whole + ' document, which will only operate on the already tagged sequences. + If Not formatSelectionWarningShown Then + MsgBox "Due to limitations in the OpenOffice/LibreOffice API, the " & _ + """Format selection"" command may not process some chemical formulas " & _ + "in the selection or may apply incorrect formatting if any format " & _ + "options are active in the ""Find & Replace"" dialog box. If this " & _ + "happens, just undo the command (if any changes were made), clear the " & _ + "format options in the ""Find & Replace"" dialog box (focus the " & _ + """Find"" field, click ""No Format"", and repeat for the ""Replace"" " & _ + "field), and run ""Format selection"" again." & Chr$(13) & Chr$(13) & _ + "This message is always shown on the first ""Format selection"" " & _ + "command in each OpenOffice/LibreOffice session because " & _ + "SuperbChemistry has no way to detect whether format options are " & _ + "active in ""Find & Replace"".", _ + 0, "SuperbChemistry ""Format selection"" notice" + formatSelectionWarningShown = True + End If + ReplaceInSelection(ThisComponent, "(?<=[A-Z][a-z]?|[\])}])[-+−0-9]+", "@m@&") + Else + SuperbReplace(ThisComponent, "(?<=[A-Z][a-z]?|[\])}])[-+−0-9]+", "@m@&", 0) + End If + + ' Step 2: Insert @c@ after a charge symbol, if it's followed by one of the + ' allowed characters for the second kind of "recognized sequence" described in + ' the readme. + SuperbReplace(ThisComponent, "(?<=@m@)([0-9]*[-+−])(?=[ \t\])}.,:;?!'""]|$)", "&@c@", 0) + + ' Step 3: Real minus signs in charges. + SuperbReplace(ThisComponent, "-@c@", "−@c@", 0) + + ' Step 4: Some groups grab a single following digit as a quantity rather than a + ' charge amount. Insert @sq@ marker to prevent the charge from grabbing the + ' digit. + SuperbReplace(ThisComponent, "(?<=(H|O|F|Cl|Br|I|[\])}])@m@)[0-9]", "&@sq@", 0) + + ' Step 5: Each charge grabs at most one digit and moves the @c@ in front to + ' prevent the quantity from grabbing the digit. + SuperbReplace(ThisComponent, "([0-9]?[−+])@c@", "@c@$1", 1) + + ' Step 6: Remove any @sq@ markers so items can grab all the digits that follow + ' for the quantity. + SuperbReplace(ThisComponent, "(.)@sq@", "$1", 0) + + ' At this point, we have only @m@ and @c@ markers left. + + ' Step 7: Format quantities: as many digits as we can still grab. The digits + ' must be followed by one of the allowed characters for the first kind of + ' "recognized sequence" described by a readme or by @, which we assume is part + ' of a @c@ tag we added in step 2. The allowed characters A-Z\[({ represent + ' the beginning of another item; the other allowed characters are the same as + ' in step 2. + SuperbReplace(ThisComponent, "(?<=@m@)[0-9]+(?=[@A-Z\[({ \t\])}.,:;?!'""]|$)", "&", -1) + + ' Step 8: Clean up @c@ markers. We know there is a charge sign after each. + SuperbReplace(ThisComponent, "@c@(.)", "$1", 0) + + ' Step 9: Clean up @m@ markers. We know there is some character before each. + SuperbReplace(ThisComponent, "(.)@m@", "$1", 0) + +End Sub + +Dim madeChanges As Boolean + +Sub UndoListener_undoActionAdded() +End Sub +Sub UndoListener_actionUndone() +End Sub +Sub UndoListener_actionRedone() +End Sub +Sub UndoListener_allActionsCleared() +End Sub +Sub UndoListener_redoActionsCleared() +End Sub +Sub UndoListener_resetAll() +End Sub +Sub UndoListener_enteredContext() +End Sub +Sub UndoListener_enteredHiddenContext() +End Sub +Sub UndoListener_leftContext() + madeChanges = True +End Sub +Sub UndoListener_leftHiddenContext() +End Sub +Sub UndoListener_cancelledContext() +End Sub + +Sub FormatSelectionOrDocument() + + Dim undoActionName As String + If HaveSelection(ThisComponent) Then + undoActionName = "SuperbChemistry: Format selection" + Else + undoActionName = "SuperbChemistry: Format document" + End If + ThisComponent.UndoManager.enterUndoContext(undoActionName) + + On Error Goto ErrorHandler + FormatSelectionOrDocumentDebug + On Error Goto 0 + + ThisComponent.UndoManager.leaveUndoContext() + Exit Sub + +ErrorHandler: + + ' If our undo context is nonempty, we want to undo the generated action. + ' If not, we do not want to undo as that would undo the user's previous + ' action. If we just check whether the title of the last undoable action + ' is "SuperbChemistry: Format (selection|document)", that might be wrong if + ' the user ran FormatDocument twice in a row: probably unlikely, but the + ' completely correct check is not that hard. + Dim listener As Object + listener = CreateUnoListener("UndoListener_", "com.sun.star.document.XUndoManagerListener") + madeChanges = False + ThisComponent.UndoManager.addUndoManagerListener(listener) + ThisComponent.UndoManager.leaveUndoContext() + ThisComponent.UndoManager.removeUndoManagerListener(listener) + If madeChanges Then + ThisComponent.UndoManager.undo() + ThisComponent.UndoManager.clearRedo() + End If + + MsgBox "SuperbChemistry encountered an unexpected error:" & Chr$(13) & Chr$(13) & _ + "Code " & Err & ": " & Error$ & Chr$(13) & Chr$(13) & _ + "Any changes made so far have been undone." & Chr$(13) & Chr$(13) & _ + "SuperbChemistry needed to catch the error in order to leave your undo " & _ + "history in a consistent state. If the problem is reproducible and you " & _ + "want to see the precise error location, run the FormatSelectionOrDocumentDebug " & _ + "macro, but be advised that it may generate multiple entries in the undo " & _ + "history and will not undo them on error.", _ + 0, "SuperbChemistry internal error" + On Error Goto 0 +End Sub \ No newline at end of file