[c2cpg] Recognize more source file extensions #5173
To be consistent with latest cmake and CDT (eclipse-cdt/cdt#422).
@@ -40,9 +40,9 @@ class HeaderAstCreationPassTests extends C2CpgSuite {
       case Seq(bar, foo, m, printf) =>
         // note that we don't see bar twice even so it is contained
         // in main.h and included in main.c and we do scan both
-        bar.fullName shouldBe "bar"
+        bar.fullName shouldBe "bar:void()"
This looks like it would be a regression for C code, in order to improve C++ code. I guess there just is inherent ambiguity with .h files, whether they contain C or C++ code. But I don't think we can get away with this right now.
Maybe we do header files in a second pass after the regular files, so we can know whether they got included from C or C++ files? Or maybe CDT has some guess-the-file-type magic since the IDE runs into the same problem?
> Or maybe CDT has some guess-the-file-type magic since the IDE runs into the same problem?

Sadly no, they also simply use the C++ parser in all cases.
The two-pass approach also won't work in all cases, as one could, e.g., include a C header file in both a C and a C++ source file.
Or to phrase it differently:
The behaviour without this PR is definitely wrong as it makes parsing CPP code in .h files impossible.
With this PR we are able to parse such code. "Wrong" fullnames for C method declarations that are never implemented in any source file should be no issue (where an implementation exists, we create the correct fullname there and de-duplicate correctly), or am I missing something?
With this PR we parse .h header files with the CDT C++ parser. While this is theoretically not correct (one should choose either the C or the C++ parser depending on the source file that includes the header, but we can't know this in all cases), we achieve a higher successful parse ratio for these files. This is also the default in Eclipse CDT (i.e., the IDE).

Method de-duplication now works by fullName (C++) or fullName + signature (C). With this change we are always able to de-duplicate:
1.) C++: the fullName already contains the signature. Hence, safe and the same as before.
2.) C: there is no method overloading anyway. Methods from .h files are now parsed as C++ methods (see 1), so we compare them against the fullName + signature of their C counterpart.
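
The two cases above can be sketched as a single comparison key. This is only an illustration of the idea, not the code from this PR: `MethodDecl`, `parsedAsCpp`, `dedupKey` and `deduplicate` are hypothetical names, and joining fullName and signature with `:` is an assumption based on the `"bar:void()"` value in the test diff.

```scala
// Hypothetical model of a method declaration; not a c2cpg class.
final case class MethodDecl(fullName: String, signature: String, parsedAsCpp: Boolean)

object MethodDedup {
  // C++ full names already embed the signature (e.g. "bar:void()").
  // C full names do not (e.g. "bar"), so the signature is appended.
  // The ":" separator is an assumption taken from the test expectation.
  def dedupKey(m: MethodDecl): String =
    if (m.parsedAsCpp) m.fullName
    else s"${m.fullName}:${m.signature}"

  // distinctBy (Scala 2.13+) keeps the first declaration seen for each key.
  def deduplicate(methods: Seq[MethodDecl]): Seq[MethodDecl] =
    methods.distinctBy(dedupKey)
}
```

With this key, a `bar` declared in `main.h` (parsed as C++) and the `bar` from `main.c` (parsed as C) would collapse into one entry. Which of the two survives depends on the input order here; the PR text does not say how the real implementation picks.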