AVS Technote

Simulating "Smart Readers" for "Large Data" Using Existing AVS Tools

The Large Data Problem is the problem of having datasets on disk which are too big to load into machine memory for visual processing. Currently, AVS users load in huge disk files and use downsize, crop, or extract scalar to get the data down to a reasonable size for further processing. This has the inherent problem of having to read in the whole data set (and use up the corresponding memory) before reducing its size.

Smart Readers can be a partial solution in that they can have built-in data reduction facilities (downsize, crop, etc.), but they typically require application-specific programming.

One way of simulating Smart Readers is to use the read field header to read only certain portions of the data set. These headers can be edited by hand using a text editor or, for more sophisticated applications, can be generated automatically using AVS tools. The limitation of the existing read field module is that it provides only a single set of data strides, so you cannot always extract exactly the data you want in one step.

Let's look at an example. Suppose you have a 10,000 pixel by 10,000 pixel image on disk and only want to load the center 512x512 area into memory. The naive AVS approach is to create a network like this:

READ IMAGE (read the whole image)
|
CROP (minx=4744, maxx=5255, miny=4744, maxy=5255)
|
IMAGE VIEWER

Alternatively, you could instruct the read field module to read only the middle 512 scanlines into memory and then crop out the pixels you don't need. Here is a simple picture illustrating which areas we want to skip, read, and ignore.

scanline 0    -----------------------------
              |...........................|
              |...........................|  skip over this block
              |...........................|
scanline 4744 -----------------------------
              |crop this |     |crop this |  read this block
              |out.......|     |out ......|
scanline 5256 -----------------------------
              |...........................|
              |...........................|  never read this area
              |...........................|
scanline 9999 -----------------------------

Here is the field file header which describes this operation.

# AVS field file
ndim = 2
dim1 = 10000
dim2 = 512
nspace = 2
veclen = 1
data = byte
field = uniform
variable 1 file=big_file.raw filetype=binary skip=47440000

The value 47440000 is the byte offset of the first pixel we're interested in: skip over 4744 scanlines of 10000 one-byte pixels per scanline. Note that dim2=512, the number of scanlines we want to read starting at that offset. The network for processing this is now:

READ FIELD
|
CROP (minx=4744, maxx=5255, miny=0, maxy=511)
|
IMAGE VIEWER
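
The arithmetic behind the skip value, and the skip-then-crop read pattern itself, can be tried outside AVS. The following is only a minimal Python sketch, assuming a raw image with one byte per pixel stored in scanline order; the names band_params and read_center are hypothetical and are not AVS modules or AVS API calls.

```python
import os
import tempfile


def band_params(width, height, size, bytes_per_pixel=1):
    """Return (skip_bytes, first_row, first_col) for a centered size x size region."""
    first_row = (height - size) // 2
    first_col = (width - size) // 2
    skip_bytes = first_row * width * bytes_per_pixel
    return skip_bytes, first_row, first_col


def read_center(path, width, height, size):
    """Emulate the skip-then-crop scheme for a one-byte-per-pixel raw image."""
    skip_bytes, _, first_col = band_params(width, height, size)
    with open(path, "rb") as f:
        f.seek(skip_bytes)            # skip the scanlines above the band
        band = f.read(width * size)   # scanlines below the band are never read
    # Crop each full-width scanline down to the columns of interest.
    return [band[r * width + first_col: r * width + first_col + size]
            for r in range(size)]


# Values for the 10,000 x 10,000 example, one byte per pixel.
skip_bytes, first_row, first_col = band_params(10000, 10000, 512)
print(skip_bytes)       # 47440000
print(first_row)        # 4744
print(10000 * 512)      # 5120000 bytes held in memory for the band

# Small synthetic image so the read can be tried without a 100 Mbyte file.
W = H = 20
pixels = bytes((row * W + col) % 256 for row in range(H) for col in range(W))
with tempfile.NamedTemporaryFile(delete=False, suffix=".raw") as tmp:
    tmp.write(pixels)
center = read_center(tmp.name, W, H, 4)
os.remove(tmp.name)
print(list(center[0]))  # [168, 169, 170, 171]
```

Only the band of full-width scanlines is ever held in memory; the column crop happens afterwards, which mirrors the READ FIELD followed by CROP network.
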

The memory usage goes from 100 Mbytes to 5.12 Mbytes. You still have to read in more data than you are interested in, but it is much less than before, and reading in new areas to look at is not that expensive.

In three dimensions, the process is similar except that typically you can only crop on planes (K-slices in the orthogonal slicer module) when reading the data in. This is because data is typically stored in contiguous scanline order on disk. If the data is stored differently (interleaved vector values, etc.), you may have to modify this scheme somewhat.

One variation on this is to generate the header files automatically, based on something like pointing to a spot in a reduced version of the image. Look at this example:

READ IMAGE
|
IMAGE VIEWER
|
GENERATE HEADER
|
READ FIELD
|
CROP
|
IMAGE VIEWER

Here, you would use upstream data from the first Image Viewer to specify which area of the large image you are interested in. The generate header module (you'd have to write this specifically for your application) would generate the header file and then pass the name to the read field module which reads in the new file.

Eventually, AVS will have Smart Readers in it as part of the standard release, but until then, there are easy and powerful alternatives.
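
As an illustration of what an application-specific generate header step might compute, here is a minimal Python sketch. It assumes byte data with veclen 1 stored in contiguous scanline (or plane) order; make_field_header, band_from_pick, and the file name volume.raw are hypothetical names chosen for this example. A real module would be written against the AVS module interface, which is not shown here.

```python
from math import prod


def make_field_header(dims, start, count, data_file):
    """Build field header text that reads `count` slices along the slowest axis.

    dims  -- full on-disk dimensions, fastest-varying first: (nx, ny) or (nx, ny, nz)
    start -- index of the first scanline (2D) or K-plane (3D) to read
    Assumes one byte per value and veclen 1.
    """
    if not 0 <= start <= dims[-1] - count:
        raise ValueError("requested slab falls outside the data set")
    slice_bytes = prod(dims[:-1])   # bytes in one scanline (2D) or one plane (3D)
    lines = ["# AVS field file", f"ndim = {len(dims)}"]
    lines += [f"dim{i} = {n}" for i, n in enumerate(dims[:-1], start=1)]
    lines += [
        f"dim{len(dims)} = {count}",
        f"nspace = {len(dims)}",
        "veclen = 1",
        "data = byte",
        "field = uniform",
        f"variable 1 file={data_file} filetype=binary skip={start * slice_bytes}",
    ]
    return "\n".join(lines) + "\n"


def band_from_pick(pick_y, downsize, full_height, count):
    """Map a picked row in the reduced image to the first full-resolution scanline."""
    start = pick_y * downsize - count // 2
    return max(0, min(start, full_height - count))   # keep the band inside the image


# 2D: pick row 500 in a 10x-reduced view of the 10,000-scanline image.
start = band_from_pick(500, 10, 10000, 512)
header_2d = make_field_header((10000, 10000), start, 512, "big_file.raw")
print(header_2d)

# 3D: read 16 K-planes starting at plane 100 of a hypothetical 256^3 byte volume.
header_3d = make_field_header((256, 256, 256), 100, 16, "volume.raw")
print(header_3d.splitlines()[-1])
```

The same function covers the two-dimensional and three-dimensional cases because in both the slab being read is contiguous on disk: a run of whole scanlines, or a run of whole K-planes. Writing the returned text to a file and handing its name to read field is left to the surrounding application.
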