From GDSII to Oasis
The Oasis file format is intended as the long-term successor to GDSII, which is now 30 years old. However, even though the first specification for Oasis was released in 2004, there is still a great deal of confusion and ignorance surrounding the standard.
This article looks at the background of Oasis’ development, identifies its strengths and weaknesses, and offers some tips and techniques for using Oasis (especially in the transition period from GDSII). It then provides some insights into its future usage.
Why switch to Oasis?
GDSII was introduced in 1978 by Calma as the successor to 1971’s GDS design file format. The need for one format to succeed another was recognized back then after just seven years. However, the 30 years since have seen no major change made to GDSII as the de facto standard even though we would all acknowledge the extraordinary rate at which design complexity has increased.
It almost goes without saying that the design databases for today’s digital chips are obese. The physical description of a GDSII-encoded system-on-chip (SoC) can easily exceed 20GB. Indeed, mask houses already talk of files in the 200GB range. Even where storage systems and data links can handle such volumes, the files themselves are still hard to manipulate.
File size is not the only issue, however. The numerical values needed to describe the geometries of nanoscale structures on 300mm wafers will soon breach GDSII’s 32bit limit. Given these concerns, the Oasis format was developed. Its first official specification was released in 2004. This article describes how the new format addresses the size and precision issues of design file encoding. It also highlights some of Oasis’ key features and offers some tips on how to get the most benefit from the standard. Our company, Xyalis, has had extensive experience with GDSII manipulation software and has already begun working with Oasis, developing ways to circumvent potential pitfalls and problems posed by the new standard.
How data size reduction works
The first goal of Oasis is to reduce the size of the database. This can be done in several ways, such as file structure optimization, the suppression of redundancies and the compacting of values.
Reducing the size of geometric description
All the geometries that comprise the physical description of a chip are made up of polygons, which are themselves described as lists of coordinates. One type of optimization reduces the number of coordinates needed to describe a polygon, while another reduces the size (in terms of used bytes) of each individual coordinate.
Numerical values
Beyond controlling file sizes, another of Oasis’ goals is overcoming the limited precision of numerical values. These objectives would seem to be at odds with one another, but Oasis serves both by storing all numerical values with variable-length encoding. Each value is split into groups of 7 bits, one group per byte. The eighth bit of each byte marks whether an additional byte follows. Using this approach, a ‘small’ value will only use 1 byte, while a ‘big’ value will use 4 or more. This has two advantages. First, statistically, most values are small enough to use fewer than 4 bytes. Second, there is no limitation, at least in the standard: a value may have ‘infinite’ precision.
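The scheme can be illustrated with a minimal Python sketch. It handles non-negative values only and assumes that the least significant 7-bit group is written first; the function names are ours and are not taken from the specification.

```python
def encode_unsigned(value):
    """Encode a non-negative integer as a sequence of 7-bit groups."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative values")
    out = bytearray()
    while True:
        group = value & 0x7F          # take the low 7 bits
        value >>= 7
        if value:
            out.append(group | 0x80)  # eighth bit set: another byte follows
        else:
            out.append(group)         # eighth bit clear: this is the last byte
            return bytes(out)


def decode_unsigned(data):
    """Decode a value produced by encode_unsigned."""
    value = 0
    for position, byte in enumerate(data):
        value |= (byte & 0x7F) << (7 * position)
        if not byte & 0x80:
            return value
    raise ValueError("truncated value: last byte still announces a follower")
```

With this sketch, a value such as 100 occupies a single byte, while a value of 2**32, which no longer fits a 32bit field, simply takes five bytes.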
Polygons
Each polygon is described as a list of coordinates. In GDSII, all coordinates comprise a pair of absolute X and Y values. In Oasis, as small values use less space, each coordinate may be defined relative to the one before it. Because most geometries are made of ‘small’ polygons (compared to chip or wafer size), describing polygons with relative coordinates dramatically reduces data size.
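The effect of relative coordinates can be estimated with a short Python sketch. The byte count per value is an assumption made for illustration (one sign bit plus the magnitude, packed 7 bits per byte), and the sample rectangle is hypothetical.

```python
def varint_len(value):
    """Bytes needed for a signed value: sign bit plus magnitude, 7 bits per byte."""
    bits = abs(value).bit_length() + 1   # +1 for the sign bit
    return max(1, -(-bits // 7))         # ceiling division


def to_deltas(points):
    """Keep the first point absolute; give each later point relative to its predecessor."""
    deltas = [points[0]]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        deltas.append((x1 - x0, y1 - y0))
    return deltas


def encoded_size(pairs):
    """Total bytes for a list of (x, y) pairs under the assumed encoding."""
    return sum(varint_len(x) + varint_len(y) for x, y in pairs)


# A 50 x 20 rectangle placed far from the origin (database units).
polygon = [(1_200_000, 3_400_000), (1_200_050, 3_400_000),
           (1_200_050, 3_400_020), (1_200_000, 3_400_020)]

absolute_bytes = encoded_size(polygon)
relative_bytes = encoded_size(to_deltas(polygon))
```

Under these assumptions the absolute form needs 32 bytes and the relative form 14, because only the first point carries large values.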
Furthermore, most polygons have standard shapes (e.g., squares, rectangles). Yet GDSII contains no shape-specific description scheme: the description of a polygon starts at one point and then goes to the next until it gets back to the beginning, with each point represented by successive X and Y coordinates. A simple square then needs five points in GDSII. In Oasis, a square is identified by one point and its size: only three values are required, and at least one of these is almost always small.
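A converter has to recognize such shapes before it can write them compactly. The following Python sketch shows one possible check for the square case; the function names are hypothetical and the real record layout is not reproduced here.

```python
def closed_outline(x, y, side):
    """Expand (corner, side) into the five-point closed outline of a point-list format."""
    return [(x, y), (x + side, y), (x + side, y + side), (x, y + side), (x, y)]


def square_from_outline(points):
    """Return (x, y, side) if the outline is an axis-aligned square, else None."""
    if len(points) != 5 or points[0] != points[-1]:
        return None
    # Every edge must be horizontal or vertical: exactly one coordinate changes.
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if (x0 != x1) == (y0 != y1):
            return None
    xs = sorted({x for x, _ in points})
    ys = sorted({y for _, y in points})
    if len(xs) != 2 or len(ys) != 2 or len(set(points[:4])) != 4:
        return None
    side = xs[1] - xs[0]
    if side != ys[1] - ys[0]:
        return None                      # a rectangle, but not a square
    return xs[0], ys[0], side
```

Assuming 4-byte coordinates, the five-point outline costs 40 bytes, whereas the three values returned by `square_from_outline` can be far smaller once they are stored with variable-length encoding.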
Source: Xyalis
Indeed, various types of rectangles or trapezoids can be identified specifically in Oasis. As Figure 1 shows, no fewer than 25 different types of trapezoids can be described, each using the minimum number of values for its full description. (At this point, we should point out a specific aspect of the Oasis format that allows a rectangle to be described as a ‘rectangle’ or as a specific type of ‘trapezoid’. This kind of variation in encoding makes parsing more complex with Oasis and poses challenges to managing the optimization process.)
Layers
The final parts of the geometry description to be optimized are the layer and data type values. In GDSII, each polygon description includes the layer and the data type numbers. In Oasis, these values are specified only if they differ from the preceding value. Moreover, as with other numerical values, layer and data type numbers may have infinite precision. The restriction to 256 values in GDSII is gone. Oasis can accommodate as many layers as an advanced process description requires.
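The "write only when it changes" rule can be sketched in Python as follows. The dictionary-based record is purely illustrative and does not reflect the binary layout of the format.

```python
def encode_layers(shapes):
    """Emit layer/datatype only when they differ from the preceding shape."""
    last_layer = last_datatype = None
    records = []
    for layer, datatype, geometry in shapes:
        record = {"geometry": geometry}
        if layer != last_layer:
            record["layer"] = layer
            last_layer = layer
        if datatype != last_datatype:
            record["datatype"] = datatype
            last_datatype = datatype
        records.append(record)
    return records


def decode_layers(records):
    """Rebuild full (layer, datatype, geometry) tuples by carrying the last seen values."""
    layer = datatype = None
    shapes = []
    for record in records:
        layer = record.get("layer", layer)
        datatype = record.get("datatype", datatype)
        shapes.append((layer, datatype, record["geometry"]))
    return shapes
```

Sorting shapes by layer before writing maximizes the benefit, since long runs of shapes then share the same implicit values.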
Optimizing geometric repetitions
A statistical analysis of any design will show many repetitions. For example, a simple contact may appear dozens of times in a single small library cell. Oasis allows such multiple occurrences of the same geometry to be instantiated.
Regular arrays
In GDSII, the basic repetition mode for a matrix of cells is a regular array (Figure 2). Oasis extends repetition to other forms, even allowing for it in the description of non-orthogonal arrays (Figure 3). This feature is especially useful in describing metal (or ‘dummy’) fill structures.
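A reader has to expand such a repetition into individual positions. The Python sketch below does this for an array defined by two step vectors, which need not be orthogonal; the parameter names are our own.

```python
def expand_lattice(origin, step_a, count_a, step_b, count_b):
    """Return every position of an array spanned by two step vectors.

    An orthogonal array is the special case step_a = (dx, 0), step_b = (0, dy).
    """
    ox, oy = origin
    ax, ay = step_a
    bx, by = step_b
    return [(ox + i * ax + j * bx, oy + i * ay + j * by)
            for j in range(count_b)
            for i in range(count_a)]


# Staggered fill pattern: the second row is shifted by half a pitch.
fill_positions = expand_lattice((0, 0), (10, 0), 3, (5, 8), 2)
```

However many copies the array contains, the file stores only one origin, two vectors and two counts.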
Random distribution
In addition to regular arrays, Oasis allows users to instantiate random distributions of the same polygon. Here, the polygon description is followed by a list of displacements, each giving the offset to the first point of the next identical polygon.
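As a rough Python sketch, converting between absolute positions and a chain of displacements looks like this (the function names are hypothetical):

```python
def displacements_from_positions(positions):
    """Turn absolute positions into the first position plus successive offsets."""
    first = positions[0]
    steps = [(x1 - x0, y1 - y0)
             for (x0, y0), (x1, y1) in zip(positions, positions[1:])]
    return first, steps


def positions_from_displacements(first, steps):
    """Rebuild absolute positions by accumulating the offsets."""
    positions = [first]
    for dx, dy in steps:
        x, y = positions[-1]
        positions.append((x + dx, y + dy))
    return positions
```

Because neighbouring copies tend to be close together, the offsets are usually small values and therefore compact once encoded.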
Optimization of cell calls
The physical description of any chip is always hierarchical. A top cell calls down to sub cells that are described separately.
Reference
Oasis allows you to refer to a cell in several ways. These include reference-by-name (as in GDSII) and reference-by-index. Different methods of declaring a cell are also available. These include declaration-by-name, declaration-by-index and declaration-by-the-automatic-numbering-of-indices. A declaration-by-index references a line in a table that is stored either at the beginning or at the end of the file.
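Reference-by-index with automatic numbering can be pictured with the following Python sketch. The class, the function and the cell names are illustrative assumptions, not part of the standard.

```python
class CellNameTable:
    """Maps cell names to small integer indices, numbering them automatically."""

    def __init__(self):
        self._names = []
        self._index_by_name = {}

    def index_for(self, name):
        """Return the index of a name, assigning the next free index on first use."""
        if name not in self._index_by_name:
            self._index_by_name[name] = len(self._names)
            self._names.append(name)
        return self._index_by_name[name]

    def name_for(self, index):
        """Resolve an index back to the cell name."""
        return self._names[index]


def placements_by_index(placements, table):
    """Replace the cell name in each (name, x, y) placement by its table index."""
    return [(table.index_for(name), x, y) for name, x, y in placements]
```

Each cell name is then stored once in the table, and every placement carries only a short integer.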
Multiple instantiation
Oasis extends the use of arrays to cover non-orthogonal cell matrices. These kinds of structures are offered so that users can, for example, efficiently instantiate dummy tiles to improve the CMP yield during manufacturing. By contrast, special care must be taken when generating dummy tiles in GDSII or the size of the database will dramatically increase. In Oasis, the database size remains under control. Another Oasis option lets users specify multiple placements of one cell by giving only the position of each instance, the shortest possible notation.
Embedded compression
Oasis allows the direct compression of some blocks inside a file using a gzip-like format. Such blocks will usually contain full cell descriptions. Each cell will then be independently compressed. Random access to the file remains possible, even if some components are in a compressed format.
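The principle of compressing each cell independently while keeping random access can be sketched in Python with the standard `zlib` module. The container layout used here (a single byte string plus an offset table) is an assumption made for illustration and is not the Oasis record structure.

```python
import zlib


def pack_cells(cells):
    """Compress each cell body on its own and remember where it landed."""
    blob = bytearray()
    offsets = {}
    for name, body in cells.items():
        # wbits=-15 selects a raw DEFLATE stream without header or trailer.
        compressor = zlib.compressobj(9, zlib.DEFLATED, -15)
        packed = compressor.compress(body) + compressor.flush()
        offsets[name] = (len(blob), len(packed))
        blob += packed
    return bytes(blob), offsets


def read_cell(blob, offsets, name):
    """Decompress a single cell without touching the others."""
    start, length = offsets[name]
    return zlib.decompress(blob[start:start + length], -15)
```

Reading one cell costs only the decompression of that cell, which is the property that whole-file compression loses.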
Performance
Depending on the database structure and the chosen method of optimization, an Oasis file is between five and 20 times smaller than one encoded in GDSII. Figure 4 gives some sample Oasis compression ratios under various schemes compared to similar GDSII file variations, assuming an untouched GDSII file has a reference value of ‘1’.
(N.B. The gzip and bzip2 compression schemes appear less efficient for an Oasis source file than for a GDSII source. This is mostly because all the numerical values these schemes target are already compacted in a raw Oasis encode. For example, all the unnecessary ‘0’ bytes present in a GDSII file can account for almost 50% of the file. These are removed as standard under Oasis.)
Potential problems
Oasis offers capabilities that promote new manufacturing and design techniques and help control and reduce database sizes, but the format does have some drawbacks.
No restrictions means no limit!
One consequence of removing the 32bit limit on the precision with which coordinates are described is that, to borrow a phrase, anything goes. Any value can have an infinite precision. This is an interesting feature in terms of fundamental mathematics but not necessarily so desirable for describing a circuit.
In reality, the tools that handle Oasis files will have an internal limit, one set by the hardware architectures on which they run. Moreover, no file should in practice ever need to use values with a precision that exceeds 64bit.
However, this still highlights a potential difficulty. Caution should also be taken before marking 64bit as an internal limit to coordinates because many tools that manipulate Oasis data today still run on 32bit architectures. They simply may not be able to read files created on a 64bit platform, or worse still, the transition could be accompanied by overflows or the conversion of positive coordinates into negative ones. The risks here are not actually very high for coordinates, but are much more significant for other integer values such as cell index or layer numbers.
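The sign flip described above is easy to reproduce. The Python sketch below checks whether a value fits a signed 32bit field and shows what a reader would obtain if it stored the value there anyway; the function names are our own.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def fits_int32(value):
    """True if the value can be stored in a signed 32-bit field without loss."""
    return INT32_MIN <= value <= INT32_MAX


def wrap_to_int32(value):
    """Value seen after two's complement wraparound into a signed 32-bit field."""
    return (value + 2**31) % 2**32 - 2**31


def out_of_range(values):
    """List the values that a 32-bit reader could not represent."""
    return [v for v in values if not fits_int32(v)]
```

For example, `wrap_to_int32(3_000_000_000)` yields a negative number, so a check such as `out_of_range` is worth running on coordinates, cell indices and layer numbers before a file is handed to a 32bit tool.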
Tables and indices
As noted earlier, all cells may be referenced through indices. An index is an entry in a table containing a cell’s name, and it usually makes referencing quite easy. However, the reference tables in Oasis can be stored in different places (e.g., the beginning or end of the file, or even spread across the whole file), and references can also be made by name. Even if all these options cannot, thankfully, be mixed in the same file, the different possibilities do exist. Therefore, an Oasis reader should be able to accept any kind of reference and must not be optimized for one particular option. One commonly used workaround here is to build the reference table at the end of the file. This makes access, prior to full file parsing, very easy.
However, there is a further wrinkle if the file is compressed, and most users will compress files because database size is a key issue. A compressed file can only be decompressed sequentially. GDSII was originally developed to be read and written on tapes, so sequential access presented no problem there. But Oasis exploits the fact that today’s storage media are random-access, and allows direct access to any location in the file. So, the most dramatic improvements in read access times are gained when a file has been decompressed.
Compatibility with GDSII
Oasis is intended to replace the GDSII format, and there will be a coexistence/handover period lasting several years. As a result, companies today have to manage heterogeneous environments and the translation of data between the two formats. This raises several issues.
Layers
GDSII is limited by its number of usable layers (256 layer numbers x 256 data types). Oasis has no limit to its layer numbers. This significantly complicates Oasis-to-GDSII conversion. Some EDA vendors have sought to overcome this problem by accepting layer numbers greater than 256, in an unofficial extension of the GDSII format. This is a useful move, but it has not yet been extended to all design software, so some incompatibilities still arise.
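Before converting to GDSII, it helps to list the layer and data type pairs that fall outside the 256 x 256 range. The following Python sketch is one way to do so, with constant and function names chosen for illustration.

```python
GDSII_MAX_LAYER = 255      # 256 layer numbers: 0..255
GDSII_MAX_DATATYPE = 255   # 256 data types: 0..255


def unconvertible_layers(layer_datatype_pairs):
    """Return the distinct (layer, datatype) pairs that standard GDSII cannot hold."""
    bad = {(layer, datatype)
           for layer, datatype in layer_datatype_pairs
           if not (0 <= layer <= GDSII_MAX_LAYER
                   and 0 <= datatype <= GDSII_MAX_DATATYPE)}
    return sorted(bad)
```

An empty result means the file can be translated without remapping layers or relying on a vendor extension.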
Circles
Oasis allows users to describe circles, whereas GDSII requires that circles are approximated by polygons. Depending on how the polygon is generated (i.e., the number of edges, position of the first point), the resulting GDSII will differ from the original Oasis.
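The dependence on the approximation settings can be seen in a few lines of Python. This is a simple sketch that places vertices on the circle and rounds them to the database grid; it is not the method of any particular converter.

```python
import math


def circle_to_polygon(cx, cy, radius, edges, start_angle=0.0):
    """Approximate a circle by a polygon with vertices on the circle, snapped to the grid."""
    points = []
    for k in range(edges):
        angle = start_angle + 2 * math.pi * k / edges
        points.append((round(cx + radius * math.cos(angle)),
                       round(cy + radius * math.sin(angle))))
    return points


def polygon_area(points):
    """Area by the shoelace formula."""
    total = 0
    for (x0, y0), (x1, y1) in zip(points, points[1:] + points[:1]):
        total += x0 * y1 - x1 * y0
    return abs(total) / 2


coarse = circle_to_polygon(0, 0, 1000, 8)
fine = circle_to_polygon(0, 0, 1000, 32)
rotated = circle_to_polygon(0, 0, 1000, 8, start_angle=math.pi / 8)
```

The three polygons describe the same circle yet differ in vertex list and enclosed area, which is why two converters rarely produce identical GDSII from the same Oasis source.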
Also, while Oasis is dedicated to mask building, no current mask writing equipment can make a circle without approximating it through polygons. This significantly complicates mask-to-database inspection and verification from an Oasis source.
Vulnerabilities
It has been found that Oasis format files can contain inconsistent, unflagged data. Using a checksum at the end of a file can reduce the risk of data corruption during a transfer, but users must bear in mind that the Oasis standard by itself does not specify how to interpret specific shapes.
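As an illustration of the checksum idea, the Python sketch below appends a CRC-32 to a block of data and verifies it on receipt. The placement and byte order of the checksum are assumptions of this sketch; the actual validation record is defined by the standard and is not reproduced here.

```python
import struct
import zlib


def append_crc(payload):
    """Return the payload followed by its CRC-32 as four little-endian bytes."""
    return payload + struct.pack("<I", zlib.crc32(payload) & 0xFFFFFFFF)


def verify_crc(data):
    """True if the last four bytes match the CRC-32 of everything before them."""
    if len(data) < 4:
        return False
    payload = data[:-4]
    (stored,) = struct.unpack("<I", data[-4:])
    return (zlib.crc32(payload) & 0xFFFFFFFF) == stored
```

A matching checksum only shows that the bytes arrived unchanged. It says nothing about whether the shapes they describe are well formed.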
Worse still, Oasis files can contain unidentified binary data, and the implication here is exactly as serious as you might expect.
Viruses
The possibility that an undetected piece of binary code can be inserted within an Oasis file, with no restrictions on its size or its content, indicates an undeniable vulnerability to viruses, trojans and worms. It would appear that such malware can be introduced even though the file will continue to declare itself clean and specification-compliant.
An Oasis file is not auto-executable, but there are already some cases where viruses have been propagated through pure data files because of lax security on the part of users.
Bad polygons
Oasis has the same limitations as GDSII when it comes to polygon shapes: there are no constraints on the allowed polygon. This can lead to different behaviors, depending on the tool environment. The following are the issues most frequently encountered:
- twisted polygons;
- self-intersecting polygons; and
- U-turns in path descriptions.
These are shapes that are syntactically correct but which can be interpreted in different ways. They were a major issue in GDSII, and too many chips failed when such configurations arose. The problem should have been dealt with in the Oasis specification but was not. Therefore, careful verification and validation of Oasis files must be undertaken before the reticle is manufactured.
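One part of such a validation, the detection of twisted or self-intersecting polygons, can be sketched in Python as follows. The check compares every pair of non-adjacent edges, so it is quadratic in the number of vertices and meant only as an illustration; U-turns in paths are not covered.

```python
def _orient(a, b, c):
    """Sign of the turn a -> b -> c: 1 left, -1 right, 0 collinear."""
    v = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (v > 0) - (v < 0)


def _within_box(a, b, c):
    """True if c lies inside the bounding box of segment ab."""
    return (min(a[0], b[0]) <= c[0] <= max(a[0], b[0])
            and min(a[1], b[1]) <= c[1] <= max(a[1], b[1]))


def segments_intersect(p1, p2, p3, p4):
    """True if segment p1p2 and segment p3p4 share at least one point."""
    o1, o2 = _orient(p1, p2, p3), _orient(p1, p2, p4)
    o3, o4 = _orient(p3, p4, p1), _orient(p3, p4, p2)
    if o1 != o2 and o3 != o4:
        return True
    # Collinear cases: an endpoint of one segment lies on the other.
    return ((o1 == 0 and _within_box(p1, p2, p3))
            or (o2 == 0 and _within_box(p1, p2, p4))
            or (o3 == 0 and _within_box(p3, p4, p1))
            or (o4 == 0 and _within_box(p3, p4, p2)))


def is_self_intersecting(points):
    """Check a polygon given as distinct vertices (first point not repeated)."""
    n = len(points)
    edges = [(points[i], points[(i + 1) % n]) for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if j == i + 1 or (i == 0 and j == n - 1):
                continue  # adjacent edges legitimately share a vertex
            if segments_intersect(*edges[i], *edges[j]):
                return True
    return False
```

A plain square passes this check, while the same four corners visited in ‘bow-tie’ order are flagged.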
How to get real benefits from Oasis
Here are four basic rules for using a GDSII-to-Oasis converter or developing your own Oasis writer:
- Always reference cells by index. It appears that some files generated after OPC processing contain millions of cells. Referencing cells by name in such configurations will have a very dramatic effect when it comes to parsing the file.
- Avoid most forms of file compression. Even if some readers, including that offered by Xyalis, can directly analyze a compressed file, most current tools will still be slowed down by index access requirements. It is much more efficient to use the embedded gzip feature.
- Keep layer numbers, indices and references within the limits set by the GDSII format.
- Carefully analyze binary code for viruses.
Conclusion
The Oasis format breaks a lot of new ground but does not correct all the limitations of GDSII. Indeed, particularly during the transition between the two formats, it adds some potential sources of error and complicates some aspects of mask data preparation.
There are several ways of optimizing design data packages with Oasis, and the results in terms of file size and analysis time can vary significantly. There is, as yet, no single best practice. Consequently, different EDA vendors will offer a range of methods for generating Oasis files.
There is still some resistance to Oasis, even though a number of companies have started to switch to the new format. Others believe that the only real issue related to GDSII is file size, and set aside the new design strategies that Oasis is intended to facilitate. Those concentrating on size believe that extending their disk and RAM capacities still represents a better deal than changing a qualified flow based on GDSII to a new one based on Oasis. However, it is our view that this work to extend GDSII’s limits cannot continue forever.
It is true that the complexity of the Oasis standard and the many options it provides for storing even the same data do mean that the number of possible errors for an Oasis file is much higher than that for the GDSII equivalent: at least four times higher. We should also note that GDSII’s weakness regarding polygon shape interpretation has not been corrected in Oasis, and that all databases in the new format demand careful validation.
But is this all that surprising, never mind a deal breaker? It also took many years to correct the most grievous errors that were found in the initial GDSII files generated by different tools. We are just at the beginning of the Oasis revolution, so the fact that detailed checks are needed today is nothing new—the same process built confidence in GDSII originally.
After many years developing tools based around the GDSII format, Xyalis recently released an Oasis format reader. It allows users to check all the critical points in an Oasis file including full specification compliance. It also validates the compatibility across 32/64bit platforms, hunts out badly formed polygons, detects the presence of unidentified binary code and much more. The arrival of this and similar software is necessary in building a platform for the new standard.