Question

I have a large code base with lots of repeated, or nearly repeated, code all over the place. It's about as un-DRY as code can get, but tracking down the "duplicates" is hard. Are there any tools for finding potentially DRY-able code, something like a diff tool or a Hamming distance analyzer? It doesn't need language-specific knowledge or anything like that.

So, any clues as to a tool like this?

Solution

Duplo (open source) works in C, C++, Java, C# and VB.Net. I tried it once, and it found enough duplicated code to keep me employed for a long time.

I've heard of Simian (commercial) but have not tried it.

OTHER TIPS

If you're working in Ruby, then you can try this.

I use Simian in VS. It's pretty good, not great.

CloneDR from Semantic Designs is a commercial product that finds duplicate code in a large number of different programming languages. http://www.semdesigns.com/Products/Clone/index.html

Large companies can afford this product. Individuals ... not so much. I wish there were some open source projects out there like this. Might be a fun project to work on. If we only knew of a community of programmers with some time on their hands ...

Semantic Designs' CloneDR finds exact and near-miss clones based on the language structure, so it isn't fooled by whitespace changes or line breaks, inserted/changed comments, or even modified variable names.

It leverages production parser front ends to work with C, C++, C#, Java, COBOL, PHP, Python, Fortran, Ada, ...

There are a number of example Clone analysis reports at the web site for various languages.
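
To make the idea of ignoring whitespace, comments, and variable names concrete, here is a rough sketch in Python. It is not how CloneDR works (the answer above says it uses the language structure via real parser front ends); this only approximates the effect at the token level with Python's standard tokenize module, and it only handles Python source. The names normalized_tokens and find_clones, and the default window size of 8 tokens, are made up for illustration.

```python
import io
import keyword
import tokenize
from collections import defaultdict

# Token types that only describe layout or comments; they are ignored so that
# reformatting or re-commenting code does not hide a duplicate.
_SKIP = {
    tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
    tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER,
}


def normalized_tokens(source):
    """Return (text, line) pairs with layout dropped and identifiers masked."""
    out = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in _SKIP:
            continue
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            text = "ID"  # every identifier looks the same, so renames do not matter
        else:
            text = tok.string
        out.append((text, tok.start[0]))
    return out


def find_clones(source, window=8):
    """Map each repeated run of `window` normalized tokens to its start lines."""
    toks = normalized_tokens(source)
    seen = defaultdict(list)
    for i in range(len(toks) - window + 1):
        key = tuple(text for text, _ in toks[i:i + window])
        seen[key].append(toks[i][1])
    # Keep only token runs that occur more than once.
    return {key: lines for key, lines in seen.items() if len(lines) > 1}
```

Calling find_clones on the text of a file returns a dictionary whose values are the line numbers where each repeated token run starts. Overlapping windows inside one long clone are reported separately, so a real tool would merge them into maximal regions. A larger window cuts down on trivial matches at the cost of missing short clones.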

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow