//----------------------------------------------------------------------------------
//testfile = new File('/usr/local/widgets/data')    // unix
testfile = new File('Pleac/data/blue.txt')           // windows
testfile.eachLine{ if (it =~ /blue/) println it }

// Groovy (like Java) uses the File class as an abstraction for
// the path representing a potential file system resource.
// Channels and Streams (along with Reader and Writer helper
// classes) are used to read and write to files (and other
// things). Files, channels, streams etc are all "normal"
// objects; they can be passed around in your programs just
// like other objects (though there are some restrictions
// covered elsewhere - e.g. you can't expect to pass a File
// object between JVMs on different machines running different
// operating systems and expect them to maintain a meaningful
// value across the different JVMs). In addition to Streams,
// there is also support for random access to files.

// Many operations are available on streams and channels. Some
// return values to indicate success or failure, some can throw
// exceptions, other times both styles of error reporting may be
// available.

// Streams at the lowest level are just a sequence of bytes though
// there are various abstractions at higher levels to allow
// interacting with streams at encoded character, data type or
// object levels if desired. Standard streams include System.in,
// System.out and System.err. Java and Groovy on top of that
// provide facilities for buffering, filtering and processing
// streams in various ways.

// File channels provide more powerful operations than streams
// for reading and writing files such as locks, buffering,
// positioning, concurrent reading and writing, mapping to memory
// etc. In the examples which follow, streams will be used for
// simple cases, channels when more advanced features are
// required. Groovy currently focusses on providing extra support
// at the file and stream level rather than the channel level.
// This makes the simple things easy but lets you do more complex
// things by just using the appropriate Java classes. All Java
// classes are available within Groovy by default.

// Groovy provides syntactic sugar over the top of Java's file
// processing capabilities by providing meaning to shorthand
// operators and by automatically handling scaffolding type
// code such as opening, closing and handling exceptions behind
// the scenes. It also provides many powerful closure operators,
// e.g. file.eachLineMatch(pattern){ some_operation } will open
// the file, process it line-by-line, find all lines which
// match the specified pattern and then invoke some operation
// for the matching line(s) if any, before closing the file.

// this example shows how to access the standard input stream
// numericCheckingScript:
prompt = '\n> '
print 'Enter text including a digit:' + prompt
new BufferedReader(new InputStreamReader(System.in)).eachLine{ line ->
    // line is read from System.in
    if (line =~ '\\d') println "Read: $line"     // normal output to System.out
    else System.err.println 'No digit found.'    // this message to System.err
}
//----------------------------------------------------------------------------------
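// As a further small sketch of the GDK sugar just described, filterLine opens the
// file, writes just the lines accepted by the closure to the given writer and then
// closes the file again (reuses the test file defined above):
testfile.filterLine(new PrintWriter(System.out, true)){ it =~ /blue/ }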
//----------------------------------------------------------------------------------
// test values (change for your os and directories)
inputPath = 'Pleac/src/pleac7.groovy'; outPath = 'Pleac/temp/junk.txt'

// For input Java uses InputStreams (for byte-oriented processing) or Readers
// (for character-oriented processing). These can throw FileNotFoundException.
// There are also other stream variants: buffered, data, filters, objects, ...
inputFile = new File(inputPath)
inputStream = new FileInputStream(inputFile)
reader = new FileReader(inputFile)
inputChannel = inputStream.channel

// Examples for random access to a file
file = new RandomAccessFile(inputFile, "rw") // for read and write
channel = file.channel

// Groovy provides some sugar coating on top of Java
println inputFile.text.size()
// => 13496

// For output Java uses OutputStreams or Writers. These can throw FileNotFound
// or IO exceptions. There are also other flavours of stream: buffered,
// data, filters, objects, ...
outFile = new File(outPath)
appendFlag = false
outStream = new FileOutputStream(outFile, appendFlag)
writer = new FileWriter(outFile, appendFlag)
outChannel = outStream.channel

// Also some Groovy sugar coating
outFile << 'A Chinese sailing vessel'
println outFile.text.size() // => 24
//----------------------------------------------------------------------------------
//----------------------------------------------------------------------------------
// Not a problem in Groovy since characters in the filename have no special
// meaning when opening a file; this is like Perl's sysopen. Options are either
// additional parameters or captured in different classes, e.g. Input vs Output,
// Buffered vs non etc.
new FileReader(inputPath)
//----------------------------------------------------------------------------------
//----------------------------------------------------------------------------------
// '~' is a shell expansion feature rather than a file system feature per se.
// Because '~' is a valid filename character on some operating systems, and Java
// attempts to be cross-platform, it doesn't automatically expand tildes.
// Given that '~' expansion is commonly used however, Java puts the $HOME
// environment variable (used by shells to do typical expansion) into the
// "user.home" system property. This works across operating systems - though
// the value inside differs from system to system so you shouldn't rely on its
// content to be of a particular format. In most cases though you should be
// able to write a regex that will work as expected. Also, Apple's
// NSPathUtilities can expand and introduce tildes on platforms it supports.
path = '~paulk/.cvspass'
name = System.getProperty('user.name')
home = System.getProperty('user.home')
println home + path.replaceAll("~$name(.*)", '$1')
// => C:\Documents and Settings\Paul/.cvspass
//----------------------------------------------------------------------------------
//----------------------------------------------------------------------------------
// The exception raised in Groovy reports the filename
try {
    new File('unknown_path/bad_file.ext').text
} catch (Exception ex) {
    System.err.println(ex.message)
}
// =>
// unknown_path\bad_file.ext (The system cannot find the path specified)
//----------------------------------------------------------------------------------
//----------------------------------------------------------------------------------
try {
    temp = File.createTempFile("prefix", ".suffix")
    temp.deleteOnExit()
} catch (IOException ex) {
    System.err.println("Temp file could not be created")
}
//----------------------------------------------------------------------------------
//----------------------------------------------------------------------------------
// no special features are provided, here is a way to do it manually
// DO NOT REMOVE THE FOLLOWING STRING DEFINITION.
pleac_7_6_embeddedFileInfo = '''
Script size is 13731
Last script update: Wed Jan 10 19:05:58 EST 2007
'''
ls = System.getProperty('line.separator')
file = new File('Pleac/src/pleac7.groovy')
regex = /(?ms)(?<=^pleac_7_6_embeddedFileInfo = ''')(.*)(?=^''')/

def readEmbeddedInfo() {
    m = file.text =~ regex
    println 'Found:\n' + m[0][1]
}

def writeEmbeddedInfo() {
    lastMod = new Date(file.lastModified())
    newInfo = "${ls}Script size is ${file.size()}${ls}Last script update: ${lastMod}${ls}"
    file.write(file.text.replaceAll(regex, newInfo))
}

readEmbeddedInfo()
// writeEmbeddedInfo()  // uncomment to make script update itself
// readEmbeddedInfo()   // uncomment to redisplay the embedded info after the update

// => (output when above two method call lines are uncommented)
// Found:
//
// Script size is 13550
// Last script update: Wed Jan 10 18:56:03 EST 2007
//
// Found:
//
// Script size is 13731
// Last script update: Wed Jan 10 19:05:58 EST 2007
//----------------------------------------------------------------------------------
//----------------------------------------------------------------------------------
// general pattern for reading from System.in is:
// System.in.readLines().each{ processLine(it) }

// general pattern for a filter which can either process file args or read from System.in is:
// if (args.size() != 0) args.each{
//     file -> new File(file).eachLine{ processLine(it) }
// } else System.in.readLines().each{ processLine(it) }

// note: the following examples are not really file-related per se. They show
// how to do option processing in scenarios which typically also
// involve file arguments. The reader should also consider using a
// pre-packaged options parser package (there are several popular
// ones) rather than the hard-coded processing examples shown here.

chopFirst = false
columns = 0
args = ['-c', '-30', 'somefile']

// demo1: optional c
if (args[0] == '-c') {
    chopFirst = true
    args = args[1..-1]
}
assert args == ["-30", "somefile"]
assert chopFirst

// demo2: processing numerical options
if (args[0] =~ /^-(\d+)$/) {
    columns = args[0][1..-1].toInteger()
    args = args[1..-1]
}
assert args == ["somefile"]
assert columns == 30

// demo3: multiple args (again, consider an option parsing package)
args = ['-n', '-a', 'file1', 'file2']
nostdout = false
append = false
unbuffer = false
ignore_ints = false
files = []
args.each{ arg ->
    switch (arg) {
        case '-n': nostdout = true; break
        case '-a': append = true; break
        case '-u': unbuffer = true; break
        case '-i': ignore_ints = true; break
        default: files += arg
    }
}
if (files.any{ it.startsWith('-') }) {
    System.err.println("usage: demo3 [-ainu] [filenames]")
}
// process files ...
assert nostdout && append && !unbuffer && !ignore_ints
assert files == ['file1', 'file2']

// find login: print all lines containing the string "login" (command-line version)
//% groovy -ne "if (line =~ 'login') println line" filename

// find login variation: lines containing "login" with line number (command-line version)
//% groovy -ne "if (line =~ 'login') println count + ':' + line" filename

// lowercase file (command-line version)
//% groovy -pe "line.toLowerCase()"

// count chunks but skip comments and stop when reaching "__DATA__" or "__END__"
chunks = 0; done = false
testfile = new File('Pleac/data/chunks.txt') // change on your system
lines = testfile.readLines()
for (line in lines) {
    if (!line.trim()) continue
    words = line.split(/[^\w#]+/).toList()
    for (word in words) {
        if (word =~ /^#/) break
        if (word in ["__DATA__", "__END__"]) { done = true; break }
        chunks += 1
    }
    if (done) break
}
println "Found $chunks chunks"

// groovy "one-liner" (cough cough) for turning .history file into pretty version:
//% groovy -e "m=new File(args[0]).text=~/(?ms)^#\+(\d+)\r?\n(.*?)$/;(0..<m.count).each{println ''+new Date(m[it][1].toInteger())+' '+m[it][2]}" .history
// =>
// Sun Jan 11 18:26:22 EST 1970 less /etc/motd
// Sun Jan 11 18:26:22 EST 1970 vi ~/.exrc
// Sun Jan 11 18:26:22 EST 1970 date
// Sun Jan 11 18:26:22 EST 1970 who
// Sun Jan 11 18:26:22 EST 1970 telnet home
//----------------------------------------------------------------------------------
//----------------------------------------------------------------------------------
// test data for below
testPath = 'Pleac/data/process.txt'

// general pattern
def processWithBackup(inputPath, Closure processLine) {
    def input = new File(inputPath)
    def out = File.createTempFile("prefix", ".suffix")
    out.write('') // create empty file
    count = 0
    input.eachLine{ line ->
        count++
        processLine(out, line, count)
    }
    def dest = new File(inputPath + ".orig")
    dest.delete() // clobber previous backup
    input.renameTo(dest)
    out.renameTo(input)
}

// use withPrintWriter if you don't want the '\n''s appearing
processWithBackup(testPath) { out, line, count ->
    if (count == 20) {              // we are at the 20th line
        out << "Extra line 1\n"
        out << "Extra line 2\n"
    }
    out << line + '\n'
}

processWithBackup(testPath) { out, line, count ->
    if (!(count in 20..30))         // skip the 20th line to the 30th
        out << line + '\n'
}

// equivalent to "one-liner":
//% groovy -i.orig -pe "if (!(count in 20..30)) out << line" testPath
//----------------------------------------------------------------------------------
//----------------------------------------------------------------------------------
//% groovy -i.orig -pe 'FILTER COMMAND' file1 file2 file3 ...

// the following may also be possible on unix systems (unchecked)
//#!/usr/bin/groovy -i.orig -p
// filter commands go here

// "one-liner" templating scenario: change DATE -> current time
//% groovy -pi.orig -e 'line.replaceAll(/DATE/){new Date()}'

//% groovy -i.old -pe 'line.replaceAll(/\bhisvar\b/, "hervar")' *.[Cchy] (globbing platform specific)

// one-liner for correcting spelling typos
//% groovy -i.orig -pe 'line.replaceAll(/(?i)\b(p)earl\b/, "\$1erl")' *.[Cchy] (globbing platform specific)
//----------------------------------------------------------------------------------
//----------------------------------------------------------------------------------
// general pattern
def processFileInplace(file, Closure processText) {
    def text = file.text
    file.write(processText(text))
}

// templating scenario: change DATE -> current time
testfile = new File('Pleac/data/pleac7_10.txt') // replace on your system
processFileInplace(testfile) { text ->
    text.replaceAll(/(?m)DATE/, new Date().toString())
}
//----------------------------------------------------------------------------------
//----------------------------------------------------------------------------------
// You need to use Java's FileChannel class to acquire file locks. The exact
// nature of the lock is somewhat dependent on the operating system.
def processFileWithLock(file, processStream) {
    def random = new RandomAccessFile(file, "rw")
    def lock = random.channel.lock() // acquire exclusive lock
    processStream(random)
    lock.release()
    random.close()
}

// Instead of an exclusive lock you can acquire a shared lock.
// Also, you can acquire a lock for just a region of a file by specifying
// the start position and size of the region when acquiring the lock.
// For non-blocking functionality, use tryLock() instead of lock().
def processFileWithTryLock(file, processStream) {
    random = new RandomAccessFile(file, "rw")
    channel = random.channel
    def MAX_ATTEMPTS = 30
    for (i in 0..<MAX_ATTEMPTS) {
        lock = channel.tryLock()
        if (lock != null) break
        println 'Could not get lock, pausing ...'
        Thread.sleep(500) // 500 millis = 0.5 secs
    }
    if (lock == null) {
        println 'Unable to acquire lock, aborting ...'
    } else {
        processStream(random)
        lock.release()
    }
    random.close()
}

// non-blocking multithreaded example: print first line while holding lock
Thread.start{
    processFileWithLock(testfile) { source ->
        println 'First reader: ' + source.readLine().toUpperCase()
        Thread.sleep(2000) // 2000 millis = 2 secs
    }
}
processFileWithTryLock(testfile) { source ->
    println 'Second reader: ' + source.readLine().toUpperCase()
}
// =>
// Could not get lock, pausing ...
// First reader: WAS LOWERCASE
// Could not get lock, pausing ...
// Could not get lock, pausing ...
// Could not get lock, pausing ...
// Could not get lock, pausing ...
// Second reader: WAS LOWERCASE
//----------------------------------------------------------------------------------
//----------------------------------------------------------------------------------
// In Java, output streams and writers have a flush() method and file channels
// have a force() method (also applicable to memory-mapped files). When creating
// PrintWriters and PrintStreams, an autoFlush option can be provided.
// From a FileInputStream or FileOutputStream you can ask for the FileDescriptor,
// which has a sync() method - but you wouldn't normally need it; you'd just use flush().
inputStream = testfile.newInputStream() // returns a buffered input stream
autoFlush = true
printStream = new PrintStream(outStream, autoFlush)
printWriter = new PrintWriter(outStream, autoFlush)
//----------------------------------------------------------------------------------
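// A small sketch of flushing/forcing explicitly (assumes the outStream and
// outChannel opened earlier in this chapter are still open and writable):
outStream.write('force me out\n'.bytes)
outStream.flush()        // push any buffered bytes to the operating system
outChannel.force(false)  // ask the OS to write the content to disk (false = content only, not metadata)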
//----------------------------------------------------------------------------------
// See the comments in 7.14 about scenarios where non-blocking can be
// avoided. Also see 7.14 regarding basic information about channels.
// An advanced feature of the java.nio.channels package is supported
// by the Selector and SelectableChannel classes. These allow efficient
// server multiplexing amongst responses from a number of potential sources.
// Under the covers, this maps to native operating system features that
// support such multiplexing, or uses a pool of worker threads much smaller
// than the total number of available connections.
//
// The general pattern for using selectors is:
//
//  while (true) {
//      selector.select()
//      def it = selector.selectedKeys().iterator()
//      while (it.hasNext()) {
//          handleKey(it.next())
//          it.remove()
//      }
//  }
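// A fuller (hedged) sketch along those lines - a tiny non-blocking echo server
// built around a Selector. It is wrapped in a method and not invoked here because
// it loops forever; the port number is just an assumption for illustration.
def runEchoSelectorServer(port) {
    def selector = java.nio.channels.Selector.open()
    def server = java.nio.channels.ServerSocketChannel.open()
    server.configureBlocking(false)
    server.socket().bind(new InetSocketAddress(port))
    server.register(selector, java.nio.channels.SelectionKey.OP_ACCEPT)
    while (true) {
        selector.select()                               // block until something is ready
        def keys = selector.selectedKeys().iterator()
        while (keys.hasNext()) {
            def key = keys.next()
            keys.remove()
            if (key.acceptable) {                       // a new connection is pending
                def client = server.accept()
                client.configureBlocking(false)
                client.register(selector, java.nio.channels.SelectionKey.OP_READ)
            } else if (key.readable) {                  // data has arrived on a connection
                def client = key.channel()
                def buf = java.nio.ByteBuffer.allocate(1024)
                if (client.read(buf) == -1) client.close()
                else { buf.flip(); client.write(buf); client.close() }  // echo back and close
            }
        }
    }
}
// runEchoSelectorServer(5000)   // uncomment to try it (runs until interrupted)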
//----------------------------------------------------------------------------------
//----------------------------------------------------------------------------------
// Groovy has no special support for this apart from making it easier to
// create threads (see note at end); it relies on Java's features here.
// InputStreams in Java/Groovy block if input is not yet available.
// This is not normally an issue, because if you have a potentially blocking
// operation, e.g. saving a large file, you normally just create a thread
// and save it in the background.

// Channels are one way to do non-blocking stream-based IO.
// Classes which extend the AbstractSelectableChannel class provide
// a configureBlocking(boolean) method as well as an isBlocking() method.
// When processing a non-blocking stream, you need to process incoming
// information based on the number of bytes read returned by the various
// read methods. For non-blocking, this can be 0 bytes even if you pass
// a fixed size byte[] buffer to the read method. Non-blocking IO is typically
// not used with files but more normally with network streams, though it can
// be when Pipes (coupled sink and source channels) are involved and one side
// of the pipe is a file.
//----------------------------------------------------------------------------------
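// A minimal sketch of a non-blocking read on a socket channel (the host and
// port are assumptions for illustration; a real reader would use a Selector
// rather than polling like this):
def nonBlockingPeek(host, port) {
    def channel = java.nio.channels.SocketChannel.open()
    channel.configureBlocking(false)
    channel.connect(new InetSocketAddress(host, port))
    while (!channel.finishConnect()) Thread.sleep(100)   // connection completes in the background
    def buf = java.nio.ByteBuffer.allocate(128)
    def count = channel.read(buf)   // returns immediately; may be 0 if nothing has arrived yet
    channel.close()
    count
}
// println nonBlockingPeek('localhost', 5000)   // uncomment to try against a listening server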
//----------------------------------------------------------------------------------
// Groovy uses Java's features here.
// For both blocking and non-blocking reads, the read operation returns the number
// of bytes read. In blocking operations, this normally corresponds to the number
// of bytes requested (typically the size of some buffer) but can have a smaller
// value at the end of a stream. Java also makes no general guarantee about
// whether other kinds of streams will return bytes as soon as they become
// available (rather than blocking until the entire buffer is filled).
// In non-blocking operations, the number of bytes returned will typically be
// the number of bytes available (up to some maximum buffer or requested size).
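// A small sketch of the usual blocking-read loop, which relies only on the
// returned count (reuses the testfile variable from the sections above):
def readBuffer = new byte[1024]
def totalBytes = 0
testfile.withInputStream{ stream ->
    def n
    while ((n = stream.read(readBuffer)) != -1) {   // n may be less than readBuffer.length
        totalBytes += n
    }
}
println "Read $totalBytes bytes from $testfile.name"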
//----------------------------------------------------------------------------------
//----------------------------------------------------------------------------------
// This just works in Java and Groovy as per the previous examples.
//----------------------------------------------------------------------------------
//----------------------------------------------------------------------------------
// Groovy uses Java's features here.
// More work has been done in the Java world on object caching than on file
// caching, with several open source and commercial offerings in that area.
// File caches are also available; for one example, see:
// http://portals.apache.org/jetspeed-1/apidocs/org/apache/jetspeed/cache/FileCache.html
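// A naive illustrative sketch (not any particular library's API) of caching file
// contents in memory and re-reading only when the lastModified timestamp changes:
class SimpleFileCache {
    private cache = [:]   // path -> [timestamp, text]
    String textFor(File f) {
        def entry = cache[f.path]
        if (!entry || entry[0] != f.lastModified()) {
            entry = [f.lastModified(), f.text]
            cache[f.path] = entry
        }
        entry[1]
    }
}
// def fileCache = new SimpleFileCache()
// println fileCache.textFor(testfile).size()   // the second call is served from memory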
//----------------------------------------------------------------------------------
//----------------------------------------------------------------------------------
// The general pattern is: streams.each{ stream -> stream.println 'item to print' }
// See the MultiStream example in 13.5 for a coded example.
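// A minimal sketch of that pattern: write the same line to System.out and to a
// temporary log file (the file here is just an assumption for illustration):
def logFile = File.createTempFile('tee', '.log')
def sinks = [System.out, new PrintStream(new FileOutputStream(logFile))]
sinks.each{ stream -> stream.println 'item to print' }
sinks[1].close()   // close the file-backed stream but leave System.out open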
//----------------------------------------------------------------------------------
//----------------------------------------------------------------------------------
// You wouldn't normally be dealing with FileDescriptors. In case you do have
// one, you would normally walk through all known file streams asking each for
// its FileDescriptor until you found one that matched. You would then close
// that stream.
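// An illustrative sketch of that approach (openStreams is a hypothetical list of
// the FileInputStream/FileOutputStream objects your program has opened):
def closeStreamWithDescriptor(List openStreams, FileDescriptor fd) {
    def match = openStreams.find{ it.getFD() == fd }
    if (match) { match.close(); openStreams.remove(match) }
    match != null
}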
//----------------------------------------------------------------------------------
//----------------------------------------------------------------------------------
// There are several concepts here. At the object level, any two object references
// can point to the same object. Any changes made through one of these will be
// visible via the 'alias'. You can also have multiple stream, reader, writer or
// channel objects referencing the same resource. Depending on the kind of resource,
// any potential locks, the operations being requested and the behaviour of
// third-party programs, the result of trying to perform such concurrent operations
// may not always be deterministic. There are strategies for coping with such
// scenarios but the best bet is to avoid the issue.

// For the scenario given, copying file handles corresponds most closely to
// cloning streams. The best bet is to just use individual stream objects, each
// created from the same file (a small sketch follows below). If you are
// attempting to do write operations, then you should consider using locks.
//----------------------------------------------------------------------------------
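// A small sketch of the "individual stream objects on the same file" advice
// (reuses the testfile variable from earlier sections; each reader keeps its
// own position in the file):
def readerA = testfile.newReader()
def readerB = testfile.newReader()
println 'A: ' + readerA.readLine()
println 'B: ' + readerB.readLine()   // also the first line - B is independent of A
readerA.close(); readerB.close()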
//----------------------------------------------------------------------------------
// locking is built into Java (since 1.4), so it should not be missing
//----------------------------------------------------------------------------------
//----------------------------------------------------------------------------------
// Java locking supports locking just regions of files.
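// A minimal sketch of a region lock (assumes the testPath variable from earlier
// in this chapter points at an existing file; takes a shared lock on the first
// 100 bytes only):
def lockedFile = new RandomAccessFile(new File(testPath), 'rw')
def regionLock = lockedFile.channel.lock(0, 100, true)   // position, size, shared
// ... read the locked region here ...
regionLock.release()
lockedFile.close()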
//----------------------------------------------------------------------------------