[Koha-devel] [bug 6458]: need some help for a parsing problem in 'misc/translator/TTParser.pm'

Chris Cormack chrisc at catalyst.net.nz
Wed Jul 20 12:02:56 CEST 2011


* Frère Sébastien Marie (semarie-koha at latrappe.fr) wrote:
> On Wed, Jul 20, 2011 at 06:42:24AM +1200, Chris Cormack wrote:
> > Hi
> > 
> > Adding tests to xt/ to catch these would be awesome too, then it would stop
> > more creeping in.
> > 
> > I could add it as a step for jenkins, so if you wanted to share the parser
> > that would be excellent, I have a koha-qa git repo we could add it to.
> 
> As said, the parser is written in Haskell (more precisely, in Literate Haskell).
> 
> So before it can be included in t/, it should be refactored (I have tried, but my knowledge of Perl is too poor, even for writing a simple stream parser).
> 
It would go in xt/ if it was to become part of the main repo, but I
was proposing putting it in the koha-qa repo, so it can go in there in
Haskell just fine.

Someone can refactor it into Perl at their leisure, but that doesn't
stop us setting up Jenkins to run it as part of its tests in the
meantime.

> The Haskell parser is just a program which takes a file's content on STDIN and outputs:
>  - nothing: no problem found
>  - numbers separated by spaces: the line numbers where a DIRECTIVE is found in a TAG
>  - an error message in parentheses
> 
> To generate the log file, I used a couple of shell scripts (a wrapper around 'find', and a wrapper around the parser).
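
A wrapper has to tell the three kinds of output apart, and the parser
can emit line numbers followed by an error message in a single run. The
sketch below shows one way to split such output. It is written in
Haskell rather than shell, and the names 'Report' and 'classify' are
hypothetical: they are not part of the attached program or of the
scripts mentioned above.

```haskell
import Data.Char (isDigit)

-- Hypothetical summary of one parser run (not part of the original program).
data Report = Report
  { directiveLines :: [Integer]    -- lines where a DIRECTIVE sits inside a TAG
  , parseError     :: Maybe String -- text of the "(...)" message, if any
  } deriving (Eq, Show)

-- Split the parser output at the first '(': everything before it is the
-- space-separated list of line numbers, everything after is the message.
classify :: String -> Report
classify out = Report nums err
  where
    (before, rest) = break (== '(') out
    -- Keep only purely numeric words, so 'read' cannot fail.
    nums = [read w | w <- words before, all isDigit w]
    err  = case rest of
             "" -> Nothing
             _  -> Just (filter (`notElem` "()") rest)
```

An empty output gives a report with no lines and no error, which is the
"no problem found" case.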

Thanks for attaching it. With your permission, I'll add it to
http://gitorious.org/koha-qa-tools for now

Chris

> -- 
> Frère Sébastien Marie
> Abbaye Notre Dame de La Trappe
> 61380 Soligny-la-Trappe
> Tél: 02.33.84.17.00
> Fax: 02.33.34.98.57
> Web: http://www.latrappe.fr/

> This program is written in Literate Haskell.
> All lines starting with '> ' are Haskell, others are comments.
> 
> Haskell is a lazy, pure functional language.
> 
> This program is not very optimized... it just works.
> 
> This program should be used as a pipe:
>  - STDIN: template to check
>  - STDOUT:
>     - nothing: no problem found
>     - numbers separated by spaces: the line numbers where a DIRECTIVE is found in a TAG
>     - an error message in parentheses (generally when EOF is found in a context other than the top-level one)
> 
> 
> First, the Haskell module definition (optional for 'Main', but stated explicitly here).
> 
> > module Main where
>  
> Next, we create the entry point of the program: the 'main' function.
> 
> The function uses 'interact' to interface with STDIN/STDOUT.
> The argument of 'interact' is a function which takes a list of Char (type String) and returns a list of Char.
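
As a minimal illustration of that shape, here is a pure String -> String
function of the kind 'interact' accepts. The name 'shout' is
hypothetical and unrelated to the parser; it only shows that the
function itself does no I/O.

```haskell
import Data.Char (toUpper)

-- Hypothetical pure transformation: upper-case the whole input.
shout :: String -> String
shout = map toUpper

-- Wiring it to STDIN/STDOUT would then be a one-liner:
--   main = interact shout
```

Because Haskell is lazy, 'interact' can start producing output before
the whole input has been read.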
> 
> We start in the INITIAL context, with the line number set to 1.
> 
> > main :: IO ()
> > main = interact $ inINITIALContext 1
> 
> All the contexts have the same form. They take:
>  an Integer argument: the current line number
>  a list of Char (the String): the input remaining on STDIN
> 
> and return
>  a list of Char (type String) : STDOUT output
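
To make this shape concrete, the sketch below gives it a name and
defines a toy context that only counts lines. Both 'Context' and
'lineCount' are hypothetical and do not appear in the program; the real
contexts follow the same pattern but hand over to one another.

```haskell
-- Hypothetical alias for the shape shared by every context:
-- current line number -> remaining input -> output.
type Context = Integer -> String -> String

-- Toy context: walk the input, bump the line number on each newline,
-- and output the final line number when the input is exhausted.
lineCount :: Context
lineCount line ('\n':xs) = lineCount (line + 1) xs
lineCount line (_x:xs)   = lineCount line xs
lineCount line []        = show line
```

Starting at line 1, the input "a\nb\nc" contains two newlines, so
'lineCount 1' yields "3".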
> 
> 
> Functions are composed of:
>  - a function signature (not mandatory [otherwise inferred], but good practice)
>  - the body, using Pattern Matching
> 
> In the INITIAL context:
>  - increment 'line' on newline
>  - if the input starts with '<!--', go to the COMMENT context
>  - if the input starts with '<![CDATA[', go to the CDATA context
>  - if the input starts with '<', go to the TAG context
>  - otherwise, discard the char
>  - on EOF in INITIAL: end normally, with no further output
> 
> > inINITIALContext :: Integer -> String -> String
> 
> > inINITIALContext line ('\n':xs) = inINITIALContext (line+1) xs
> > inINITIALContext line ('<':'!':'-':'-':xs) = inCOMMENTContext line xs
> > inINITIALContext line ('<':'!':'[':'C':'D':'A':'T':'A':'[':xs) = inCDATAContext line xs
> > inINITIALContext line ('<':xs)  = inTAGContext line xs
> > inINITIALContext line (_x:xs)   = inINITIALContext line xs
> > inINITIALContext _line [] = ""
> 
> 
> In the COMMENT context:
>  - increment 'line' on newline
>  - go back to INITIAL at the end of the comment, '-->'
>  - otherwise, discard the char
>  - on EOF in COMMENT: output an error message
> 
> > inCOMMENTContext :: Integer -> String -> String
> 
> > inCOMMENTContext line ('\n':xs)        = inCOMMENTContext (line+1) xs
> > inCOMMENTContext line ('-':'-':'>':xs) = inINITIALContext line xs
> > inCOMMENTContext line (_x:xs) = inCOMMENTContext line xs
> > inCOMMENTContext _line []     = "(EOF in COMMENT)"
> 
> In the CDATA context:
>  - increment 'line' on newline
>  - go back to INITIAL at the end of the CDATA section, ']]>'
>  - otherwise, discard the char
>  - on EOF in CDATA: output an error message
> 
> 
> > inCDATAContext :: Integer -> String -> String
> 
> > inCDATAContext line ('\n':xs) = inCDATAContext (line+1) xs
> > inCDATAContext line (']':']':'>':xs) = inINITIALContext line xs
> > inCDATAContext line (_x:xs)   = inCDATAContext line xs
> > inCDATAContext _line []       = "(EOF in CDATA)"
> 
> 
> In the TAG context:
>  - increment 'line' on newline
>  - go to STRING if '"' is found
>  - go back to INITIAL if '>' is found
>  - if '[%' is found, a template DIRECTIVE is inside a TAG: output the line number (converted from Integer to String with 'show'), append a space, then append the result of parsing the rest (stay in the TAG context and continue parsing)
>  - otherwise, discard the char
>  - on EOF in TAG: output an error message
> 
> > inTAGContext :: Integer -> String -> String
> 
> > inTAGContext line ('\n':xs)    = inTAGContext (line+1) xs
> > inTAGContext line ('"':xs)     = inSTRINGContext line xs
> > inTAGContext line ('>':xs)     = inINITIALContext line xs
> > inTAGContext line ('[':'%':xs) = show line ++ " " ++ inTAGContext line xs
> > inTAGContext line (_x:xs)      = inTAGContext line xs
> > inTAGContext _line []          = "(EOF in TAG)"
> 
> 
> In the STRING context:
>  - increment 'line' on newline
>  - skip an escaped quote \" (stay in the STRING context)
>  - go back to the TAG context at the end of the string
>  - otherwise, discard the char
>  - on EOF in STRING: output an error message
> 
> > inSTRINGContext :: Integer -> String -> String
> 
> > inSTRINGContext line ('\n':xs)     = inSTRINGContext (line+1) xs
> > inSTRINGContext line ('\\':'"':xs) = inSTRINGContext line xs
> > inSTRINGContext line ('"':xs)      = inTAGContext line xs
> > inSTRINGContext line (_x:xs)       = inSTRINGContext line xs
> > inSTRINGContext _line []           = "(EOF in STRING)"
> 
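
To see the contexts working together, the block below reproduces the
five functions from the program so that it stands alone, and pairs a few
sample templates with the output obtained by tracing the rules by hand.
The helper 'check', the list 'examples' and the sample templates are
hypothetical additions for illustration.

```haskell
-- The five contexts, reproduced from the program so the block stands alone.
inINITIALContext, inCOMMENTContext, inCDATAContext, inTAGContext,
  inSTRINGContext :: Integer -> String -> String

inINITIALContext line ('\n':xs) = inINITIALContext (line+1) xs
inINITIALContext line ('<':'!':'-':'-':xs) = inCOMMENTContext line xs
inINITIALContext line ('<':'!':'[':'C':'D':'A':'T':'A':'[':xs) = inCDATAContext line xs
inINITIALContext line ('<':xs)  = inTAGContext line xs
inINITIALContext line (_x:xs)   = inINITIALContext line xs
inINITIALContext _line []       = ""

inCOMMENTContext line ('\n':xs)        = inCOMMENTContext (line+1) xs
inCOMMENTContext line ('-':'-':'>':xs) = inINITIALContext line xs
inCOMMENTContext line (_x:xs)          = inCOMMENTContext line xs
inCOMMENTContext _line []              = "(EOF in COMMENT)"

inCDATAContext line ('\n':xs)        = inCDATAContext (line+1) xs
inCDATAContext line (']':']':'>':xs) = inINITIALContext line xs
inCDATAContext line (_x:xs)          = inCDATAContext line xs
inCDATAContext _line []              = "(EOF in CDATA)"

inTAGContext line ('\n':xs)    = inTAGContext (line+1) xs
inTAGContext line ('"':xs)     = inSTRINGContext line xs
inTAGContext line ('>':xs)     = inINITIALContext line xs
inTAGContext line ('[':'%':xs) = show line ++ " " ++ inTAGContext line xs
inTAGContext line (_x:xs)      = inTAGContext line xs
inTAGContext _line []          = "(EOF in TAG)"

inSTRINGContext line ('\n':xs)     = inSTRINGContext (line+1) xs
inSTRINGContext line ('\\':'"':xs) = inSTRINGContext line xs
inSTRINGContext line ('"':xs)      = inTAGContext line xs
inSTRINGContext line (_x:xs)       = inSTRINGContext line xs
inSTRINGContext _line []           = "(EOF in STRING)"

-- Hypothetical helper: run the parser on a whole template, from line 1.
check :: String -> String
check = inINITIALContext 1

-- Hypothetical sample templates, each paired with its traced output.
examples :: [(String, String)]
examples =
  [ ("<p>hello</p>\n",               "")                 -- no directive at all
  , ("<a\n href=[% url %]>x</a>",    "2 ")               -- directive in a tag, line 2
  , ("<a href=\"[% url %]\">x</a>",  "")                 -- directive inside a quoted string
  , ("<input [% IF x %]checked",     "1 (EOF in TAG)")   -- directive, then unclosed tag
  , ("<!-- never closed",            "(EOF in COMMENT)") -- unclosed comment
  ]
```

The third sample shows that a directive inside a double-quoted attribute
value is not reported, because the STRING context discards everything up
to the closing quote.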



-- 
Chris Cormack
Catalyst IT Ltd.
+64 4 803 2238
PO Box 11-053, Manners St, Wellington 6142, New Zealand