Code Formatter | Reply
TopCoder is in the process of developing a set of components that will perform code formatting for multiple programming languages. We would like TopCoder members to assist in defining the scope of this application.

Once completed, the components will be used in the arena to clean up code (when viewed during the challenge phase, for example) and may also be used to guarantee standardized code formatting across all components (for example, code in development submissions might be formatted automatically).

Below are the current goals of this project. These goals may be refined as the project matures. Please look over these goals and let us know if you have any ideas or comments regarding this project.

We will also be looking to outsource a lot of the work involved in this project (similar to the UML Tool) so if anyone is interested in helping out, let me know.

Goals:

The Code Formatter is made up of several components and will be developed following the entire TopCoder application development process: specification, architecture, component development, assembly, testing and deployment.

The Code Formatter formats code written in multiple languages. The following languages are supported: Java, C#, VB.NET and C++.

The Code Formatter uses a parser-based approach to formatting for maximum flexibility (versus a simple finite-state machine approach). It generates a syntax tree from the input and produces its output while visiting the nodes of the tree.
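
To make the tree-visiting idea concrete, here is a minimal sketch in Python. The node types (`Statement`, `Block`), the `PrettyPrinter` class and the `format_tree` helper are all hypothetical names invented for illustration; a real formatter would get a much richer tree from its parser.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Statement:
    text: str  # e.g. "x = 1" (stored without the trailing semicolon)


@dataclass
class Block:
    header: str  # e.g. "if (x > 0)"
    body: List[Union["Statement", "Block"]]


Node = Union[Statement, Block]


class PrettyPrinter:
    """Walks the tree and emits formatted lines for each node it visits."""

    def __init__(self, indent: str = "    ") -> None:
        self.indent = indent
        self.lines: List[str] = []

    def visit(self, node: Node, depth: int = 0) -> None:
        # Dispatch on the node's class name: visit_Statement or visit_Block.
        method = getattr(self, "visit_" + type(node).__name__)
        method(node, depth)

    def visit_Statement(self, node: Statement, depth: int) -> None:
        self.lines.append(self.indent * depth + node.text + ";")

    def visit_Block(self, node: Block, depth: int) -> None:
        self.lines.append(self.indent * depth + node.header + " {")
        for child in node.body:
            self.visit(child, depth + 1)
        self.lines.append(self.indent * depth + "}")


def format_tree(root: Node, indent: str = "    ") -> str:
    printer = PrettyPrinter(indent)
    printer.visit(root)
    return "\n".join(printer.lines)
```

In this sketch the indentation string is the only style option; the point is that layout decisions live in the visitor, not in the parser, so further options could be added there.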

The Code Formatter supports a wide variety of configuration options to manipulate the style of the code that it outputs.

The Code Formatter gracefully handles invalid input (input that does not compile in the target language).

The Code Formatter properly handles comments, strings and macros: comments are carried over to the output and may or may not receive formatting, such as maximum line length enforcement; the formatter does not get confused by keywords in strings; and the formatter can handle macros in languages that support them.
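
As an illustration of the "keywords in strings" point, a tokenizer can match whole comments and string literals before it looks for keywords. The sketch below is only an assumption about how that might look: `TOKEN_RE`, `KEYWORDS` and `tokenize` are made-up names, the keyword set is a small illustrative subset, and character literals and macros are not handled.

```python
import re

TOKEN_RE = re.compile(
    r"""
    (?P<comment>//[^\n]*|/\*.*?\*/)      # line or block comment
  | (?P<string>"(?:\\.|[^"\\\n])*")      # double-quoted string with escapes
  | (?P<word>[A-Za-z_]\w*)               # identifier or keyword
  | (?P<space>\s+)                       # whitespace
  | (?P<other>.)                         # any other single character
    """,
    re.VERBOSE | re.DOTALL,
)

KEYWORDS = {"if", "else", "for", "while", "return"}  # illustrative subset only


def tokenize(source):
    """Yield (kind, text) pairs; keywords are recognised only outside strings and comments."""
    for match in TOKEN_RE.finditer(source):
        kind, text = match.lastgroup, match.group()
        if kind == "word" and text in KEYWORDS:
            kind = "keyword"
        yield kind, text
```

Because comments and strings are tried first, `"for while"` comes out as a single string token, and concatenating all token texts reproduces the input exactly, so nothing is lost before formatting starts.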

The Code Formatter will never alter the code in a way that makes it behave differently from the original when compiled and executed.
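
One way such a guarantee could be sanity-checked is to compare the token sequences of the input and the output, ignoring only whitespace between tokens. The following is a conservative sketch under that assumption, not a proof of equivalence: `significant_tokens` and `is_safe_reformat` are hypothetical helpers, the token rules are simplified, and preprocessor directives are not considered.

```python
import re

_TOKEN = re.compile(
    r"""
    "(?:\\.|[^"\\\n])*"        # string literal
  | '(?:\\.|[^'\\\n])*'        # character literal
  | //[^\n]*                   # line comment
  | /\*.*?\*/                  # block comment
  | [\w.]+                     # identifiers, keywords, numbers, dotted names
  | [-+*/%&|^=!<>]+            # run of operator characters, e.g. ++ or >=
  | \s+                        # whitespace
  | .                          # any other single character
    """,
    re.VERBOSE | re.DOTALL,
)


def significant_tokens(source):
    """All tokens except pure whitespace; strings and comments are kept intact."""
    return [tok for tok in _TOKEN.findall(source) if not tok.isspace()]


def is_safe_reformat(original, formatted):
    """True only if both texts produce the same sequence of non-whitespace tokens."""
    return significant_tokens(original) == significant_tokens(formatted)
```

The check errs on the side of rejecting: for example it treats `a=-b` and `a = -b` as different because the operator run changes, even though a real lexer would accept that change.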
Re: Code Formatter (response to post by FogleBird) | Reply
I guess you are planning to use standard parser generators (and their available grammars) to obtain the syntax trees and then apply a translation scheme to the tree.
If you plan to handle invalid input, this may be tricky even for languages with widely available grammars (like Java), because the generated parser would have to be modified to handle invalid input (otherwise it may not parse and generate a syntax tree at all). More importantly, how do you plan to deal with a language like C++, which doesn't have a valid LALR grammar or a grammar suitable for predictive parsing? (I'm not sure about ANTLR, but this rules out almost all parser generators.)
Even the GCC front end consists of hand-crafted code, though it could be a valid starting point.
Please note that even the code formatters in Visual Studio, Vim and other tools are unable to handle invalid input in a very meaningful way, even for languages with widely available grammars.
Re: Code Formatter (response to post by amit_pundir) | Reply
Compile != Syntactically valid

This is my interpretation of this requirement. Even Microsoft had big problems implementing IntelliSense and refactoring for code that is not valid (i.e. work in progress).
Re: Code Formatter (response to post by amit_pundir) | Reply
Well, actually, since he says the Code Formatter will consist of multiple components, I guess one of those components will be a hand-crafted parser suited to doing reasonable recognition of invalid code.
Re: Code Formatter (response to post by FogleBird) | Reply
The project looks interesting to me; I'd be glad to participate.
One idea for the parser: it probably needs to give some grammar constructs priority over others. E.g. when we are parsing an expression (in Java) and the parentheses are not yet balanced when we reach a ';' character, it is a good idea to treat the ';' as the end of both the expression and the statement currently being parsed, i.e. the end of a statement should take priority over the end of an expression, etc.
My explanation is probably not very good, but I hope it gives the general idea. I am not sure, but maybe such a parser can be implemented with a parser generator; it would probably require hacking with some of its subtle features, though, so it looks better to build a TC component to do the parsing.
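
A rough sketch of the ';' priority idea, assuming the input has already been split into tokens. The function name `split_statements` and the special case for `for (...)` headers (where ';' legitimately appears inside parentheses) are my own assumptions, and braces are ignored entirely.

```python
def split_statements(tokens):
    """Split a token list into (tokens, well_formed) statements.

    A ';' ends the current statement even if parentheses are still open,
    except directly inside a 'for (...)' header, where ';' separates clauses.
    Tokens left over without a closing ';' are reported as not well formed.
    """
    statements, current = [], []
    open_parens = []  # one flag per unclosed '(': True if it opens a for-header
    for tok in tokens:
        current.append(tok)
        if tok == "(":
            prev = current[-2] if len(current) > 1 else None
            open_parens.append(prev == "for")
        elif tok == ")" and open_parens:
            open_parens.pop()
        elif tok == ";":
            if open_parens and open_parens[-1]:
                continue  # ';' inside a for-header: keep reading
            # Statement end wins over expression end: close everything here.
            statements.append((current, not open_parens))
            current, open_parens = [], []
    if current:
        statements.append((current, False))
    return statements
```

With the tokens of `x = f(a; y = 2;` this yields a first statement flagged as malformed and a second one flagged as well formed, so only the broken statement has to be passed through unformatted.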
Re: Code Formatter (response to post by FogleBird) | Reply
I am also interested in participating in this project. Even if it does not work 100% correctly from the start, it will be a lot of help to developers in development competitions. Right now each of us uses our own tweaks for removing trailing spaces, verifying that no lines are longer than 120 chars, adding a space before ( in "for" statements, etc.; all those things will be handled by this Checkstyle-like mechanism.
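
For reference, the kind of ad hoc tweak mentioned above might look like the sketch below. The function names are hypothetical, and the `for(` check is a naive regular expression that does not skip strings or comments, which is exactly the weakness a parser-based tool would avoid.

```python
import re

MAX_LINE_LENGTH = 120  # the limit mentioned in the post above


def check_style(source, max_len=MAX_LINE_LENGTH):
    """Return (line_number, message) pairs for three simple style violations."""
    problems = []
    for number, line in enumerate(source.split("\n"), start=1):
        if line != line.rstrip():
            problems.append((number, "trailing whitespace"))
        if len(line) > max_len:
            problems.append((number, "line longer than %d characters" % max_len))
        if re.search(r"\bfor\(", line):  # naive: also matches inside strings
            problems.append((number, "missing space between 'for' and '('"))
    return problems


def strip_trailing_whitespace(source):
    """Remove trailing spaces and tabs from every line, keeping line breaks."""
    return "\n".join(line.rstrip() for line in source.split("\n"))
```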
Re: Code Formatter (response to post by amit_pundir) | Reply
Don't most parser generators provide mechanisms to handle errors while parsing? My plan was to use such an error handler to take the incorrect input and output it with no formatting and try to restore to a state where parsing can continue. In the worst case, the output will receive no formatting and will remain unchanged. Can this work?
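
The fallback described here can be sketched without a real parser generator. In the toy below, `format_statement` is a hypothetical stand-in for "parse and print" that understands only `name = value;`, and a line break is assumed to be the point where parsing can resume; both are assumptions made for illustration.

```python
class ParseError(Exception):
    """Raised when a piece of input cannot be parsed."""


def format_statement(stmt):
    """Hypothetical stand-in for a real parser plus printer.

    Only 'name = value;' is understood; anything else raises ParseError.
    """
    body = stmt.strip()
    if not body.endswith(";") or body.count("=") != 1:
        raise ParseError(stmt)
    name, value = (part.strip() for part in body[:-1].split("="))
    if not name.isidentifier() or not value:
        raise ParseError(stmt)
    return "%s = %s;" % (name, value)


def format_lines(source):
    """Format each line that parses; copy any other line through unchanged."""
    out = []
    for line in source.split("\n"):
        try:
            out.append(format_statement(line))
        except ParseError:
            out.append(line)  # recovery: emit the input verbatim, resume on the next line
    return "\n".join(out)
```

In the worst case every line raises `ParseError` and the output equals the input, which matches the "remain unchanged" behaviour described in the post.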
Re: Code Formatter (response to post by FogleBird) | Reply
So, no one has any requests regarding the functionality of these components? Perhaps a code formatter is one of those things that's just "expected to work" and we take it for granted? =)
Re: Code Formatter (response to post by FogleBird) | Reply
ok, I'll give my requests:

- fulfills the TC standard without further editing
- generates a suitable header with the appropriate year if one is missing
- generates required tags (@author etc.); some may need further manual editing
- Eclipse-friendly, either as a plugin or with a configuration that can be exported as an Eclipse code formatter configuration file
- XML formatter
- HTML formatter
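
The header request could be sketched roughly as follows. The header wording in `HEADER_TEMPLATE` is a placeholder, not the actual TC header, and "already has a header" is approximated as "the file starts with a block comment", which is a crude assumption.

```python
import datetime

# Placeholder wording; the real TC header text is not given in this thread.
HEADER_TEMPLATE = "/*\n * Copyright (C) {year} Example Corp. All Rights Reserved.\n */\n"


def ensure_header(source, year=None):
    """Prepend a copyright header unless the file already starts with a block comment."""
    if source.lstrip().startswith("/*"):
        return source
    if year is None:
        year = datetime.date.today().year
    return HEADER_TEMPLATE.format(year=year) + source
```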
Re: Code Formatter (response to post by FogleBird) | Reply
"Don't most parser generators provide mechanisms to handle errors while parsing? My plan was to use such an error handler to take the incorrect input and output it with no formatting and try to restore to a state where parsing can continue. In the worst case, the output will receive no formatting and will remain unchanged. Can this work?"
I don't know about "most" parser generators, but yacc and bison support this sort of thing, so it's not novel. I would expect to see it in any reasonably mature parser generator, to the extent that I would probably consider a parser generator immature if it didn't have such support.
Re: Code Formatter (response to post by enefem21) | Reply
Regarding Eclipse, one of the issues I hate most of all: then Checkstyle tells me about some easy violations, there is no "Quick Fix" option to fix them automatically.

If TC does this, I will pay 50 USD for this tool ;-)
Re: Code Formatter (response to post by TAG) | Reply
when TAG... not then... :P