C Parser (Front End)
The C parser (front end) enables the construction of custom C compilers, analysis tools, or source transformation tools. It is a member of SD's family of language front ends, all built on DMS, SD's infrastructure for implementing such custom tools. The C front end includes:
- Lexical analysis including ASCII, EBCDIC, ISO 8859-1, UTF-8 and UTF-16, and Japanese Shift-JIS
- Conversion of literal values (numbers, escaped strings) into native values to enable easy computation over literal values
- String literals represented internally in Unicode to support 16-bit characters
- Explicit grammar that directly implements de facto and official standards and their extensions
- Full C (ISO 9899:1990) parser
- Option for C99 (ISO 9899:1999) dialect
- Option for C11 (ISO 9899:2011) dialect
- Option for GNU C (GCC2/GCC3/GCC4/GCC5.0 including vector extensions) dialects
- Option for Microsoft Visual6 C dialect
- Easy extension for other dialects
- Preprocessor support
- Controllable include directory paths
- Option to fully expand preprocessor directives
- Option to parse include files for definitions
- Option to parse preserving preprocessor conditional directives, macros and include directives
- Automatic construction of complete abstract syntax tree
- Capture of comments and formats (shape) of literal values
- Capture of ambiguous parses during parsing
- Ability to parse large systems of files into same workspace, enabling interprocedural and cross-file analysis/transformation
- Ability to parse different languages into same workspace, enabling cross-language analysis/transformation
- Facilities to process syntax trees
- Complete procedural API to visit/query/update/construct/print syntax trees
- Source regeneration by prettyprinting and/or fidelity printing of syntax trees with comments and lexical formats
- Automatically generated source-to-source transformation system
- Ability to define custom attribute-grammar-based analyzers
- Name and Type resolution
- Type representation system for all C types defined
- All identifiers resolved to their C-defined type and stored in symbol tables
- Automatic deletion of erroneous alternatives of ambiguous parses
- Ability to condition transforms on identifier type
- Ability to visit/query/update symbol tables
- Control flow graph extraction for each compilation unit
- Constructed for each function definition
- Ties control flow nodes to ASTs
- Exposes sequence points
- Computes Post-dominators
- Computes Control Dependences (Sample control flow graph)
- Application call graph extraction across all compilation units
- Qualifies function pointers by "address taken" and argument types
- Computes Transitive "Has-side-effect" information
- Data Flow Analysis support
- Forward and Backward Iterative Flow analyzers
- Reaching definitions for scalar values, struct members, array elements, structs and arrays (Sample reaching definitions graph)
- Use-definition chains
- Definition-use chains
- Reachable-uses analysis
- Available as source code to enable complete customization
- Means to manage multiple language dialects with highly shared common core
- Robustness due to careful testing and application across many customers
Many of these facilities are a direct consequence of the front end being built on top of DMS.
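The items above on capturing ambiguous parses, and on deleting the erroneous alternatives during name and type resolution, reflect a well-known property of C: some token sequences cannot be classified from syntax alone. The following is a minimal sketch in standard C, not taken from the front end itself; the function names are hypothetical and chosen for illustration. The text `(T) - 1` is a cast if `T` names a type, and a subtraction if `T` names a variable, so a parser that has not yet consulted a symbol table must keep both readings.

```c
typedef int T;                  /* at file scope, T names a type */

/* T is a type name in this scope, so "(T) - 1" is a cast applied to -1. */
int cast_reading(void)
{
    return (T) - 1;             /* cast: evaluates to -1 */
}

/* A local variable named T hides the typedef inside this function,
   so the very same tokens now mean "T minus 1". */
int subtraction_reading(void)
{
    int T = 6;                  /* shadows the typedef within this block */
    return (T) - 1;             /* subtraction: evaluates to 5 */
}
```

Calling the two functions returns -1 and 5 respectively, even though the `return` statements are token-for-token identical. Only the declaration of `T` visible at each point decides which parse is correct, which is why the choice between alternatives is tied to symbol table information.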
Here are some sample tools (many offered by SD as products) built using the C front end:
- Source Code Search Engine
- Preprocessor conditional simplification given fixed assertions about #defined identifiers
- Source Formatter
- Obfuscator
- Test Coverage
- Profiler
- Duplicate Code Detection and Removal
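To make the preprocessor conditional simplification item concrete, here is a hand-written sketch of the idea, not output from the SD tool. It assumes one fixed assertion, "`LEGACY_IO` is never defined", and leaves `USE_FAST_PATH` unknown. The macro names and function names are hypothetical. The first function shows the code as a maintainer might find it; the second shows what remains after every conditional is partially evaluated under that assertion.

```c
/* Before: mentions LEGACY_IO (asserted undefined) and USE_FAST_PATH (unknown). */
int checksum_before(const unsigned char *buf, int n)
{
    int sum = 0;
    int i;
#ifdef LEGACY_IO
    n = n > 512 ? 512 : n;              /* old fixed-size record limit */
#endif
#if defined(LEGACY_IO) || defined(USE_FAST_PATH)
    for (i = 0; i + 1 < n; i += 2)      /* two bytes per iteration */
        sum += buf[i] + buf[i + 1];
    if (i < n)                          /* leftover byte when n is odd */
        sum += buf[i];
#else
    for (i = 0; i < n; i++)             /* one byte per iteration */
        sum += buf[i];
#endif
    return sum;
}

/* After: the #ifdef LEGACY_IO block is gone, and the || condition has
   shrunk to the part that is still unknown. */
int checksum_after(const unsigned char *buf, int n)
{
    int sum = 0;
    int i;
#if defined(USE_FAST_PATH)
    for (i = 0; i + 1 < n; i += 2)      /* two bytes per iteration */
        sum += buf[i] + buf[i + 1];
    if (i < n)                          /* leftover byte when n is odd */
        sum += buf[i];
#else
    for (i = 0; i < n; i++)             /* one byte per iteration */
        sum += buf[i];
#endif
    return sum;
}
```

The conditional on `USE_FAST_PATH` survives because nothing was asserted about it. In any build where `LEGACY_IO` is undefined, the two functions compile to the same code; for example, both return 15 for the five bytes 1, 2, 3, 4, 5.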
Your organization may use DMS with the C front end to implement and deploy your own custom tools. The sample tools can be obtained in source form as part of the C front end for customization. Semantic Designs is also willing to build custom tools under contract.