Results
RQ1.1: Stack Overflow Answerers’ Awareness
General information
Table 1: Stack Overflow answerers who took the survey
| Reputation | Sent emails | Answers | Rate |
|---|---|---|---|
| 963,731-6,999 | 607 | 201 | 33% |
Table 2: Experience of Stack Overflow answerers
| Experience | Amount | Percent |
|---|---|---|
| Less than a year | 1 | 0.5% |
| 1 – 2 years | 1 | 0.5% |
| 3 – 5 years | 30 | 14.9% |
| 5 – 10 years | 58 | 28.9% |
| More than 10 years | 111 | 55.2% |
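The rate in Table 1 and the percentages in Table 2 can be re-derived from the raw counts. The following is a minimal sketch, assuming that every percentage in Table 2 uses the 201 survey answers as its denominator; the variable and function names are illustrative and are not part of the study's tooling.

```python
# Counts copied from Table 1 and Table 2; all names here are illustrative.
sent_emails = 607
answers = 201

experience = {
    "Less than a year": 1,
    "1 – 2 years": 1,
    "3 – 5 years": 30,
    "5 – 10 years": 58,
    "More than 10 years": 111,
}


def percent(part, whole, digits=1):
    """Return part/whole as a percentage rounded to `digits` decimals."""
    return round(100 * part / whole, digits)


# Response rate of the survey (Table 1).
response_rate = percent(answers, sent_emails, digits=0)

# Share of each experience group (Table 2), assuming 201 is the denominator.
shares = {label: percent(count, answers) for label, count in experience.items()}

print(f"Response rate: {response_rate:.0f}%")
for label, share in shares.items():
    print(f"{label}: {share}%")
```

Under that assumption the sketch reproduces the reported 33% rate and the Table 2 percentages, and the five experience groups sum to the 201 answers.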
Code snippets in answers
Table 3: Frequency of including code snippets in answers
| Include code snippets | Amount | Percent |
|---|---|---|
| Very Frequently (81–100% of the time) | 84 | 42% |
| Frequently (61–80% of the time) | 63 | 31% |
| Occasionally (41–60% of the time) | 40 | 20% |
| Rarely (21–40% of the time) | 11 | 6% |
| Very Rarely (1–20% of the time) | 2 | 1% |
| Never (0% of the time) | 1 | 1% |
| Total | 201 | 100% |
Figure 2: The sources of code snippets in Stack Overflow answers

Outdated code snippets
Table 4: Notifications of outdated code snippets in answers
| Notified of outdated code | Amount | Percent |
|---|---|---|
| Very frequently (81–100% of my answers) | 2 | 1% |
| Frequently (61–80% of my answers) | 1 | 0.5% |
| Occasionally (41–60% of my answers) | 9 | 4.5% |
| Rarely (21–40% of my answers) | 16 | 8% |
| Very rarely (1–20% of my answers) | 103 | 51.5% |
| Never (0% of my answers) | 69 | 34.5% |
License of code snippets
Table 5: Inclusion of software license in answer
| Include license? | Amount |
|---|---|
| No. | 197 |
| Yes, in code comment | 1 |
| Yes, in text surrounding the code | 2 |
| Total | 200 |
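Table 5 reports only absolute counts. As a rough illustration, the corresponding shares can be computed as below; the percentages are derived here from the counts and are not figures reported by the study, and the dictionary name is hypothetical.

```python
# Counts copied from Table 5; the percentages are derived, not reported.
license_inclusion = {
    "No": 197,
    "Yes, in code comment": 1,
    "Yes, in text surrounding the code": 2,
}

total = sum(license_inclusion.values())

# Share of each answer option, rounded to one decimal place.
license_percent = {
    option: round(100 * count / total, 1)
    for option, count in license_inclusion.items()
}

for option, share in license_percent.items():
    print(f"{option}: {share}%")
```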
Table 6: Checking for licensing conflicts with CC BY-SA 3.0
| Check license conflicts? | Amount | Percent |
|---|---|---|
| Very Frequently (81–100% of the time) | 14 | 7% |
| Frequently (61–80% of the time) | 7 | 3.5% |
| Occasionally (41–60% of the time) | 10 | 5% |
| Rarely (21–40% of the time) | 16 | 8% |
| Very rarely (1–20% of the time) | 15 | 7.5% |
| Never (0% of the time) | 138 | 69% |
| Total | 200 | 100% |
Answer to RQ 1.1
RQ1.2: Stack Overflow Visitors’ Awareness
General information
Twenty-four participants (27%) have over 10 years of programming experience, and twenty-one (24%) have 5–10 years. A further 19 participants (21%) have 3–5 years, 18 (20%) have 1–2 years, and 7 (8%) have less than a year.
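The percentages quoted for the visitors' experience can be checked against the counts. This sketch assumes the five groups are exhaustive, so the denominator is simply their sum; that total is derived here and is not stated in the text.

```python
# Counts taken from the paragraph above; group labels are paraphrased.
visitor_experience = {
    "More than 10 years": 24,
    "5-10 years": 21,
    "3-5 years": 19,
    "1-2 years": 18,
    "Less than a year": 7,
}

# Assumption: the five groups cover all respondents to this question.
respondents = sum(visitor_experience.values())

# Whole-number percentages, matching the precision used in the text.
visitor_percent = {
    group: round(100 * count / respondents)
    for group, count in visitor_experience.items()
}

print(f"Respondents: {respondents}")
print(visitor_percent)
```

With the summed denominator, the rounded shares agree with the percentages given in the text.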
Table 7: Problems from Stack Overflow code snippets
| Problem | Amount |
|---|---|
| Mismatched solutions | 40 |
| Outdated solutions | 39 |
| Incorrect solutions | 28 |
| Buggy code | 1 |
Table 8: Frequency of reporting the problems to Stack Overflow posts
| Report? | Amount | Percent |
|---|---|---|
| Very Frequently (81–100% of the time) | 1 | 1.8% |
| Frequently (61–80% of the time) | 1 | 1.8% |
| Occasionally (41–60% of problematic snippets) | 3 | 5.3% |
| Rarely (21–40% of problematic snippets) | 8 | 14.0% |
| Very rarely (1–20% of problematic snippets) | 8 | 14.0% |
| Never (0% of problematic snippets) | 36 | 63.2% |
| Total | 57 | 100% |
Table 9: Check for licensing conflicts before using Stack Overflow snippets
| License check? | Amount | Percent |
|---|---|---|
| Very frequently (81–100% of the time) | 0 | 0.0% |
| Frequently (61–80% of the time) | 7 | 8.1% |
| Occasionally (41–60% of the time) | 6 | 6.9% |
| Rarely (21–40% of the time) | 6 | 6.9% |
| Very rarely (1–20% of the time) | 11 | 12.6% |
| Never (0% of the time) | 57 | 65.5% |
| Total | 87 | 100% |
Answer to RQ 1.2
RQ2: Online Code Clones
Table 10: Investigated online clone pairs and corresponding snippets and Qualitas projects
| Set | Pairs | Snippets | Projects | Cloned ratio |
|---|---|---|---|---|
| Reported clones | 2,289 | 460 | 59 | 53.28% |
| TP from manual validation | 2,063 | 443 | 59 | 54.09% |
Answer to RQ 2
RQ3: Patterns of Online Code Cloning
Table 11: Classifications of online clone pairs
| Set | QS | SQ | EX | UD | BP | IN | AC | Total |
|---|---|---|---|---|---|---|---|---|
| Before consolidation | 247 | 1 | 197 | 107 | 1,495 | 16 | 226 | 2,289 |
| After consolidation | 153 | 1 | 109 | 65 | 216 | 9 | 53 | 606 |
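The row totals in Table 11 can be verified, and the effect of consolidation per category made explicit, with a short script. This is only a sketch: the `retained` ratio is an illustrative measure introduced here, not one defined by the study.

```python
# Category counts copied from Table 11.
categories = ["QS", "SQ", "EX", "UD", "BP", "IN", "AC"]
before = dict(zip(categories, [247, 1, 197, 107, 1495, 16, 226]))
after = dict(zip(categories, [153, 1, 109, 65, 216, 9, 53]))

total_before = sum(before.values())
total_after = sum(after.values())

# Illustrative measure: fraction of each category's pairs left after consolidation.
retained = {c: round(after[c] / before[c], 2) for c in categories}

# Pairs before consolidation once the AC category is left out.
non_ac_before = total_before - before["AC"]

print(total_before, total_after, non_ac_before)
print(retained)
```

Both row totals match the table (2,289 and 606). The count without AC is 2,063, which coincides with the "TP from manual validation" pair count in Table 10; reading AC as the pairs rejected during validation is an inference from that match, not something this section states.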
Answer to RQ 3
RQ4: Outdated Online Code Clones
Figure 1: Outdated QS online clone pairs grouped by project

Table 12: Six code modification types found when comparing the outdated clone pairs to their latest versions
| Modification | Occurrences |
|---|---|
| Statement modification | 50 |
| Statement addition | 28 |
| Statement removal | 18 |
| Method signature change | 16 |
| Method rewriting | 15 |
| File deletion | 14 |
Table 13: Examples of the outdated QS online clones (see full results in the Interactive menu)
| Post | Post date | Project | File | Start | End | Qualitas date | Issue ID | Type* | Change date |
|---|---|---|---|---|---|---|---|---|---|
| 2513183 | 25/3/10 | eclipse | GenerateToStringAction.java | 113 | 166 | 5/6/13 | Bug 439874 | S | 17/3/15 |
| 22315734 | 11/3/14 | hadoop | WritableComparator.java | 44 | 54 | 25/8/11 | HADOOP-11323 | S | 20/11/14 |
| 23520731 | 7/5/14 | hibernate | SchemaUpdate.java | 115 | 168 | 22/5/13 | HHH-10458 | S | 5/2/16 |
| 18232672 | 14/8/13 | log4j | SMTPAppender.java | 207 | 228 | 31/3/10 | Bug 44644 | R | 18/10/08 |
| 17697173 | 17/7/13 | lucene | SlowSynonymFilterFactory.java | 38 | 52 | 6/4/13 | LUCENE-4095 | D | 31/5/12 |
| 21734562 | 12/2/14 | tomcat | FormAuthenticator.java | 51 | 61 | 4/8/10 | BZ 59823 | R | 4/8/16 |
| 12593810 | 26/9/12 | poi | WorkbookFactory.java | 49 | 60 | 7/12/09 | 57593 | R | 30/4/15 |
| 8037824 | 7/11/11 | jasperreports | JRVerifier.java | 1221 | 1240 | 31/5/10 | N/A | D | 20/5/11 |
| 3758110 | 21/9/10 | spring | DefaultAnnotationHandlerMapping.java | 78 | 92 | 20/10/10 | SPR-14129 | D | 20/1/12 |
| 14019840 | 24/12/12 | struts | DefaultActionMapper.java | 273 | 288 | 17/7/10 | WW-4225 | S | 18/10/13 |
Note: S: modified/added/deleted statements, D: file has been deleted, R: method has been rewritten completely
Answer to RQ 4
RQ5: Software Licensing Violation
Table 14: License mapping of online clones (file-level)
| Type | Qualitas | Stack Overflow (CC BY-SA 3.0) | QS | EX | UD |
|---|---|---|---|---|---|
| Compatible | Apache-2 | Apache-2 | 1 | ||
| EPLv1 | EPLv1 | 2 | 1 | ||
| Proprietary | Proprietary | 2 | |||
| Sun Microsystems | Sun Microsystems | 3 | |||
| No license | No license | 20 | 9 | 2 | |
| No license | CC BY-SA 3.0 | 1 | |||
| Total | 23 | 15 | 3 | ||
| Incompatible | AGPLv3/3+ | No license | 1 | 4 | |
| Apache-2 | No license | 46 | 14 | 12 | |
| BSD/BSD3 | No license | 4 | 1 | ||
| CDDL or GPLv2 | No license | 6 | |||
| EPLv1 | No license | 10 | 6 | ||
| GPLv2+/3+ | No license | 8 | 48 | 7 | |
| LesserGPLv2.1+/3+ | No license | 16 | 9 | ||
| MPLv1.1 | No license | 1 | |||
| Oracle | No license | 3 | |||
| Proprietary | No license | 1 | 2 | ||
| Sun Microsystems | No license | 1 | 2 | ||
| Unknown | No license | 11 | |||
| LesserGPLv2.1+ | New BSD3 | 1 | |||
| Total | 86 | 78 | 50 |