Only 39% of the functions in node_modules are unique in the default Angular project

Original author: Pavel Gurov

Only 39% of the functions in node_modules are unique in the default Angular project created by ng new my-app.

I think open-source developers solve problems in the same ways because they study the same algorithms. And, to be honest, they also copy popular solutions from StackOverflow.

How to compare functions in JavaScript?

To compare functions in JavaScript, just convert them to strings:

const a = () => 'hi';
a.toString(); // "() => 'hi'"
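A minimal sketch of that comparison, assuming a helper named sameSource (a hypothetical name, not part of the author's script). It shows that two separate function objects with identical source text compare equal as strings, while any textual difference, such as an extra parameter, makes them differ:

```javascript
// Two functions count as "the same" here only if their source text matches exactly.
function sameSource(f, g) {
  return f.toString() === g.toString();
}

const a = () => 'hi';
const b = () => 'hi';
const c = (name) => 'hi';

console.log(a === b);          // false: two distinct function objects
console.log(sameSource(a, b)); // true: identical source text
console.log(sameSource(a, c)); // false: the parameter list differs
```

The comparison is purely textual, so whitespace, comments and identifier names all count as differences.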

If variable names are different

To compare such functions, bring them to the same form with the UglifyJS library. UglifyJS minifies each function, strips unnecessary whitespace and line breaks, and evaluates simple constant expressions.

How do you extract functions from a Javascript file?

For this I used the Esprima library: parse the file, then walk the AST. I think UglifyJS alone could do the job, but something went wrong and I was too lazy to figure it out.
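A rough sketch of the tree walk, not the author's actual script. The helper name collectFunctions is hypothetical. It assumes an ESTree-shaped AST whose nodes carry a range of [start, end] offsets, which Esprima produces when parseScript is called with the range option. To keep the sketch free of dependencies, the AST below is written by hand for a one-line script:

```javascript
const FUNCTION_TYPES = new Set([
  'ArrowFunctionExpression',
  'FunctionExpression',
  'FunctionDeclaration',
]);

// Recursively visit every node and collect the source text of function nodes.
function collectFunctions(node, source, found = []) {
  if (node === null || typeof node !== 'object') return found;
  if (Array.isArray(node)) {
    for (const child of node) collectFunctions(child, source, found);
    return found;
  }
  if (FUNCTION_TYPES.has(node.type) && Array.isArray(node.range)) {
    const [start, end] = node.range;
    found.push({ type: node.type, code: source.slice(start, end) });
  }
  for (const key of Object.keys(node)) {
    if (key !== 'range') collectFunctions(node[key], source, found);
  }
  return found;
}

// With Esprima installed, the AST would come from something like:
//   const ast = require('esprima').parseScript(source, { range: true });
// The hand-written AST below stands in for that call.
const source = 'const b1 = () => 2;';
const ast = {
  type: 'Program',
  range: [0, 19],
  body: [{
    type: 'VariableDeclaration',
    kind: 'const',
    range: [0, 19],
    declarations: [{
      type: 'VariableDeclarator',
      range: [6, 18],
      id: { type: 'Identifier', name: 'b1', range: [6, 8] },
      init: {
        type: 'ArrowFunctionExpression',
        range: [11, 18],
        params: [],
        body: { type: 'Literal', value: 2, raw: '2', range: [17, 18] },
        expression: true,
      },
    }],
  }],
};

console.log(collectFunctions(ast, source));
// [ { type: 'ArrowFunctionExpression', code: '() => 2' } ]
```

Because the walk descends into function bodies as well, nested functions are collected both on their own and as part of their parent's source.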

Action plan

  1. Iterate through all the *.js files in the directory
  2. Parse each file and extract functions of types ArrowFunctionExpression, FunctionExpression and FunctionDeclaration
  3. Compress each function with UglifyJS and write it to a file whose name is the hash of the function
  4. Write the id, path and hash to a separate file, info.csv
  5. Load info.csv into SQLite and run all kinds of queries, because this database is not a toy!
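Steps 1, 3 and 4 can be sketched with Node's standard library alone. This is an illustration under stated assumptions, not the author's script: the helper names are hypothetical, MD5 is assumed only because the hashes in this article are 32 hex characters long, and the CSV column order is a guess. Parsing and minification (steps 2 and 3) are left out, so the demo hashes a ready-made minified string:

```javascript
const crypto = require('crypto');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Step 1: recursively collect every *.js file under `dir` (symlinks are skipped).
function findJsFiles(dir) {
  const result = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) result.push(...findJsFiles(full));
    else if (entry.isFile() && entry.name.endsWith('.js')) result.push(full);
  }
  return result;
}

// Step 3: identify a function by the hash of its minified source (MD5 is an assumption).
function hashOf(code) {
  return crypto.createHash('md5').update(code).digest('hex');
}

// Step 4: one CSV row per function occurrence (column order is an assumption).
function csvRow(id, filePath, hash) {
  return `${id},"${filePath.replace(/"/g, '""')}",${hash}`;
}

// Demo on a throwaway directory instead of a real node_modules.
const root = fs.mkdtempSync(path.join(os.tmpdir(), 'fn-scan-'));
fs.mkdirSync(path.join(root, 'pkg'));
fs.writeFileSync(path.join(root, 'pkg', 'index.js'), 'module.exports = () => 2;');
fs.writeFileSync(path.join(root, 'readme.md'), 'not JavaScript');

const files = findJsFiles(root);
const rows = files.map((file, i) =>
  csvRow(i + 1, path.relative(root, file), hashOf('const z=()=>2;')));
console.log(rows.join('\n'));

fs.rmSync(root, { recursive: true, force: true });
```

Hashing the minified text rather than the raw source is what lets two differently formatted copies of the same function land on the same hash.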

Implementation details

  • I named all arrow functions and function expressions z;
  • I renamed ordinary function declarations to MORK but stored the original names separately, because the same function may appear under different names;
  • These renames may have cost me some statistics on recursive functions. Oh well!
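The naming convention above can be illustrated with a small string-based helper. This is a simplification for illustration only: normalizeName is a hypothetical name, the author's script presumably works on the AST, and the regular expression below handles only plain declarations of the form function name(...), not async or generator functions:

```javascript
// Put one extracted function's source text into the fixed-name form.
function normalizeName(source, type) {
  if (type === 'FunctionDeclaration') {
    // Capture and replace only the name right after the leading `function` keyword.
    const pattern = /^function\s+([A-Za-z_$][\w$]*)/;
    const match = pattern.exec(source);
    return {
      code: source.replace(pattern, 'function MORK'),
      originalName: match ? match[1] : null,
    };
  }
  // Arrow functions and function expressions become the initializer of `z`.
  return { code: `const z=${source};`, originalName: null };
}

console.log(normalizeName('function apple(t){return t+3}', 'FunctionDeclaration'));
// { code: 'function MORK(t){return t+3}', originalName: 'apple' }
console.log(normalizeName('()=>2', 'ArrowFunctionExpression'));
// { code: 'const z=()=>2;', originalName: null }
```

Only the declared name changes, so a recursive function that calls itself by its old name inside the body keeps that reference, which may be the loss of recursion statistics mentioned above.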

Extracting functions from a file

The file from which we will extract functions:

(function () {
    const arbuz = (test) => {
        function apple(t) {
            function test () {
                return 'ttt';
            }
            return t + 3;
        }
        const aa = 1;
        const b1 = () => 2;
        // comment
        return aa + b1() + apple(test);
    };
    return arbuz;
})();

Note that some expressions are evaluated during minification, which makes the functions easier to compare. The list of extracted functions:

"const z=function(){return n=>{return 3+(n+3)}};";
"const z=n=>{return 3+(n+3)};";
"function MORK(n){return n+3}";
"function MORK(){return"ttt"}";
"const z=()=>2;";

The full script can be found here

First research object: node_modules of the default Angular 11 project

First, create a project with @angular/cli (ng new my-app), then run the script against its node_modules. This can take a long time.

package.json:

{
  "name": "my-app",
  "version": "0.0.0",
  "scripts": {
    "ng": "ng",
    "start": "ng serve",
    "build": "ng build --prod",
    "test": "ng test",
    "lint": "ng lint",
    "e2e": "ng e2e"
  },
  "private": true,
  "dependencies": {
    "@angular/animations": "~11.2.10",
    "@angular/common": "~11.2.10",
    "@angular/compiler": "~11.2.10",
    "@angular/core": "~11.2.10",
    "@angular/forms": "~11.2.10",
    "@angular/platform-browser": "~11.2.10",
    "@angular/platform-browser-dynamic": "~11.2.10",
    "@angular/router": "~11.2.10",
    "rxjs": "~6.6.0",
    "tslib": "^2.0.0",
    "zone.js": "~0.11.3"
  },
  "devDependencies": {
    "@angular-devkit/build-angular": "~0.1102.9",
    "@angular/cli": "~11.2.9",
    "@angular/compiler-cli": "~11.2.10",
    "@types/jasmine": "~3.6.0",
    "@types/node": "^12.11.1",
    "codelyzer": "^6.0.0",
    "jasmine-core": "~3.6.0",
    "jasmine-spec-reporter": "~5.0.0",
    "karma": "~6.1.0",
    "karma-chrome-launcher": "~3.1.0",
    "karma-coverage": "~2.0.3",
    "karma-jasmine": "~4.0.0",
    "karma-jasmine-html-reporter": "^1.5.0",
    "protractor": "~7.0.0",
    "ts-node": "~8.3.0",
    "tslint": "~6.1.0",
    "typescript": "~4.1.5"
  }
}

The package-lock.json is here


The node_modules folder contains 26982 *.js files:

$ find . -name '*.js' | wc -l

Number of functions found: 338230

sqlite> select count(*) from info;

Including unique: 130886

sqlite> Select count(*) from (SELECT hash, count(id) as c FROM info group By hash);

It follows that 130886 / 338230 × 100% ≈ 39% of the functions are unique; the rest are duplicates of functions that already exist elsewhere in node_modules.
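The same arithmetic in code, as a small sketch. The helper name uniqueShare is hypothetical; it treats "unique" the way the query above does, as the number of distinct hashes divided by the number of function occurrences:

```javascript
// Share of distinct hashes among all function occurrences, as a whole percentage.
function uniqueShare(hashes) {
  const distinct = new Set(hashes).size;
  return Math.round((distinct / hashes.length) * 100);
}

// Five occurrences, three distinct hashes.
console.log(uniqueShare(['a', 'a', 'b', 'c', 'a'])); // 60

// The article's totals, plugged in directly.
console.log(Math.round((130886 / 338230) * 100)); // 39
```

Note that a hash seen many times still counts once as distinct, so 39% is the share of distinct function bodies, not the share of functions that occur exactly once.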

You can download the CSV file to check the numbers yourself here.

Functions with the largest number of duplicates

SELECT hash, count(id) as c FROM info group By hash order by c desc LIMIT 20;

# hash number of duplicates
1 285d00ca29fcc46aa113c7aefc63827d 2730
2 cf6a0564f1128496d1e4706f302787d6 1871
3 12f746f2689073d5c949998e0216f68a 1174
4 7d1e7aad635be0f7382696c4f846beae 772
5 c2da306af9b041ba213e3b189699d45c 699
6 c41eb44114860f3aa1e9fa79c779e02f 697
7 5911b29c89fa44f28ce030aa5e433327 691
8 05c2b9b254be7e4b8460274c1353b5ad 653
9 fcaede1b9e574664c893e75ee7dc1d8b 652
10 e743dd760a03449be792c00e65154a48 635
11 777c390d3cc4663f8ebe4933e5c33e9d 441
12 27628ad740cff22386b0ff029e844e85 385
13 f6822db5c8812f4b09ab142afe908cda 375
14 d98a03a472615305b012eceb3e9947d5 330
15 4728096fca2b3575800dafbdebf4276a 324
16 7b769d3e4ba438fc53b42ad8bece86ba 289
17 7d6f69751712ef9fa94238b38120adc6 282
18 b7081aad7510b0993fcb57bfb95c5c2c 255
19 d665499155e104f749bf3a67caed576a 250
20 99fa7dfce87269a564fc848a7f7515b9 250

  1. 285d00ca29fcc46aa113c7aefc63827d, 2730 identical

    const z=function(){};

  2. cf6a0564f1128496d1e4706f302787d6, 1871 identical, function names usually: __export

    function MORK(r){for(var o in r)exports.hasOwnProperty(o)||(exports[o]=r[o])}

  3. 12f746f2689073d5c949998e0216f68a, 1174 identical, function names: _interopRequireDefault and __importDefault

    function MORK(e){return e&&e.__esModule?e:{default:e}}

  4. 7d1e7aad635be0f7382696c4f846beae, 772 identical

    function MORK(){}

  5. c2da306af9b041ba213e3b189699d45c, 699 identical

    const z=function(o,_){o.__proto__=_};

  6. c41eb44114860f3aa1e9fa79c779e02f, 697 identical, function name: __

    function MORK(){this.constructor=d}

  7. 5911b29c89fa44f28ce030aa5e433327, 691 identical

    const z=function(n,o){for(var r in o)o.hasOwnProperty(r)&&(n[r]=o[r])};

  8. 05c2b9b254be7e4b8460274c1353b5ad, 653 identical

    const z=function(t,n){return extendStatics=Object.setPrototypeOf||{__proto__:[]}instanceof Array&&function(t,n){t.__proto__=n}||function(t,n){for(var o in n)n.hasOwnProperty(o)&&(t[o]=n[o])},extendStatics(t,n)};

  9. fcaede1b9e574664c893e75ee7dc1d8b, 652 identical

    const z=function(t,o){function e(){this.constructor=t}extendStatics(t,o),t.prototype=null===o?Object.create(o):(e.prototype=o.prototype,new e)};

  10. e743dd760a03449be792c00e65154a48, 635 identical

    function(){var r=function(t,o){return(r=Object.setPrototypeOf||{__proto__:[]}instanceof Array&&function(t,o){t.__proto__=o}||function(t,o){for(var n in o)o.hasOwnProperty(n)&&(t[n]=o[n])})(t,o)};return function(t,o){function n(){this.constructor=t}r(t,o),t.prototype=null===o?Object.create(o):(n.prototype=o.prototype,new n)}};

  11. 777c390d3cc4663f8ebe4933e5c33e9d, 441 identical, function names usually: Rule, AsapScheduler, ComplexOuterSubscriber and others

    function MORK(){return null!==_super&&_super.apply(this,arguments)||this}

  12. 27628ad740cff22386b0ff029e844e85, 385 identical, function names usually: identity, forwardResolution, etc.

    function MORK(n){return n}

  13. f6822db5c8812f4b09ab142afe908cda, 375 identical

    const z=function(n){};

  14. d98a03a472615305b012eceb3e9947d5, 330 identical

    const z=function(n,c){};

  15. 4728096fca2b3575800dafbdebf4276a, 324 identical

    const z=function(n){return n};

  16. 7b769d3e4ba438fc53b42ad8bece86ba, 289 identical, function name: plural

    function MORK(t){var r=Math.floor(Math.abs(t)),t=t.toString().replace(/^[^.]*\.?/,"").length;return 1===r&&0===t?1:5}

  17. 7d6f69751712ef9fa94238b38120adc6, 282 identical

    const z=function(){return this};

  18. b7081aad7510b0993fcb57bfb95c5c2c, 255 identical

    const z=function(){return!1};

  19. d665499155e104f749bf3a67caed576a, 250 identical

    const z=function(n){return null==n};

  20. 99fa7dfce87269a564fc848a7f7515b9, 250 identical

    const z=function(a,c){this._array.forEach(a,c)};
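Both this ranking and the per-file ranking in the next section are group-and-count queries. A plain JavaScript equivalent might look like the sketch below; topCounts is a hypothetical helper, and the rows are made-up data shaped like info.csv (id, path, hash), not values from the real file:

```javascript
// Count rows per key and return the `limit` most frequent keys, largest first.
function topCounts(rows, key, limit) {
  const counts = new Map();
  for (const row of rows) {
    counts.set(row[key], (counts.get(row[key]) || 0) + 1);
  }
  return [...counts.entries()]
    .sort((x, y) => y[1] - x[1])
    .slice(0, limit)
    .map(([value, count]) => ({ [key]: value, count }));
}

// Hypothetical rows; the values are invented for the demo.
const info = [
  { id: 1, path: 'a/index.js', hash: 'h1' },
  { id: 2, path: 'a/index.js', hash: 'h2' },
  { id: 3, path: 'b/util.js', hash: 'h1' },
  { id: 4, path: 'c/main.js', hash: 'h1' },
];

console.log(topCounts(info, 'hash', 1)); // [ { hash: 'h1', count: 3 } ]
console.log(topCounts(info, 'path', 1)); // [ { path: 'a/index.js', count: 2 } ]
```

Grouping by hash gives the most duplicated functions; grouping by path gives the files with the most functions.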

Files with the largest number of functions

SELECT count(id) as c, path FROM info group By path order by c desc LIMIT 20;

quantity file
13638 typescript/lib/tsserver.js
13617 typescript/lib/tsserverlibrary.js
12411 typescript/lib/typescriptServices.js
12411 typescript/lib/typescript.js
12411 @schematics/angular/third_party/github.com/Microsoft/TypeScript/lib/typescript.js
10346 sass/sass.dart.js
8703 typescript/lib/typingsInstaller.js
8528 typescript/lib/tsc.js
3933 @angular/compiler/bundles/compiler.umd.js
3803 @angular/compiler/bundles/compiler.umd.min.js
2602 selenium-webdriver/lib/test/data/js/tinymce.min.js
2264 @angular/core/bundles/core.umd.js
2028 @angular/core/bundles/core.umd.min.js
1457 terser/dist/bundle.min.js
1416 rxjs/bundles/rxjs.umd.js
1416 @angular-devkit/schematics/node_modules/rxjs/bundles/rxjs.umd.js
1416 @angular-devkit/core/node_modules/rxjs/bundles/rxjs.umd.js
1416 @angular-devkit/build-webpack/node_modules/rxjs/bundles/rxjs.umd.js
1416 @angular-devkit/build-angular/node_modules/rxjs/bundles/rxjs.umd.js
1416 @angular-devkit/architect/node_modules/rxjs/bundles/rxjs.umd.js

What about the production js bundle?

Nothing really interesting: Webpack works efficiently. Of the 1282 functions in the bundle, 95% are unique. Here are the top five functions that have duplicates:

quantity function
11 const z=function(){};
10 const z=()=>R;
8 const z=function(n){return new(n||t)};
6 const z=function(n){};
5 const z=()=>{};

What about React?

I also checked React. I put the comparison in the table below:

In node_modules          Angular   React
Total *.js files         26982     23942
Total functions          338230    163385
Unique functions         130886    92766
% of unique functions    39%       57%

You can download the CSV file to check the numbers yourself here.

The React project was generated with create-react-app my-app. Its package.json and yarn.lock files are here.

# hash number of duplicates
1 12f746f2689073d5c949998e0216f68a 1377
2 285d00ca29fcc46aa113c7aefc63827d 1243
3 3f993321f73e83f277c20c178e5587b9 989
4 54782ec6cef850906484808b86946b33 299
5 7d1e7aad635be0f7382696c4f846beae 278
6 d11004e998280b565ad084b0ad5ca214 239
7 a02c66d8928b3353552e4804c6714326 237
8 79e9bd3cdf15cf0af97f73ccaed50fa0 231
9 7d6f69751712ef9fa94238b38120adc6 189
10 b8dd34af96b042c23a4be7f82c881fe4 176
11 863a48e36413feba8bb299623dbc9b20 174
12 2482d2afd404031c67adb9cbc012768b 174
13 4728096fca2b3575800dafbdebf4276a 170
14 bf8b05684375b26205e50fa27317057e 157
15 fd114ee6b71ee06738b5b547b00e8102 156
16 df1c43e5a72e92d11bdefcead13a5e14 156
17 094afc30995ff28993ec5326e8b3c4d4 156
18 042490db7093660e74a762447f64f950 156
19 5c5979ec3533f13b22153de05ffc64d5 154
20 50645492c50621c0847c4ebd1fdd65cd 154

  1. 12f746f2689073d5c949998e0216f68a, 1377 identical, function names usually: _interopRequireDefault

    function MORK(e){return e&&e.__esModule?e:{default:e}}

  2. 285d00ca29fcc46aa113c7aefc63827d, 1243 identical

    const z=function(){};

  3. 3f993321f73e83f277c20c178e5587b9, 989 identical

    const z=function(){return data};

  4. 54782ec6cef850906484808b86946b33, 299 identical

    const z=()=>{};

  5. 7d1e7aad635be0f7382696c4f846beae, 278 identical, function names: emptyFunction, Generator

    function MORK(){}

  6. d11004e998280b565ad084b0ad5ca214, 239 identical

    const z=function(){return cache};

  7. a02c66d8928b3353552e4804c6714326, 237 identical, function name: _getRequireWildcardCache

    function MORK(){if("function"!=typeof WeakMap)return null;var e=new WeakMap;return _getRequireWildcardCache=function(){return e},e}

  8. 79e9bd3cdf15cf0af97f73ccaed50fa0, 231 identical, function name: _interopRequireWildcard

    function MORK(e){if(e&&e.__esModule)return e;if(null===e||"object"!=typeof e&&"function"!=typeof e)return{default:e};var t=_getRequireWildcardCache();if(t&&t.has(e))return t.get(e);var r,n,o={},c=Object.defineProperty&&Object.getOwnPropertyDescriptor;for(r in e)Object.prototype.hasOwnProperty.call(e,r)&&((n=c?Object.getOwnPropertyDescriptor(e,r):null)&&(n.get||n.set)?Object.defineProperty(o,r,n):o[r]=e[r]);return o.default=e,t&&t.set(e,o),o}

  9. 7d6f69751712ef9fa94238b38120adc6, 189 identical

    const z=function(){return this};

  10. b8dd34af96b042c23a4be7f82c881fe4, 176 identical

    const z=function(n,o,c,i){n[i=void 0===i?c:i]=o[c]};

  11. 863a48e36413feba8bb299623dbc9b20, 174 identical

    const z=function(e,n,t,o){void 0===o&&(o=t),Object.defineProperty(e,o,{enumerable:!0,get:function(){return n[t]}})};

  12. 2482d2afd404031c67adb9cbc012768b, 174 identical

    const z=function(){return m[k]};

  13. 4728096fca2b3575800dafbdebf4276a, 170 identical

    const z=function(n){return n};

  14. bf8b05684375b26205e50fa27317057e, 157 identical

    const z=s=>exposed.has(s);

  15. fd114ee6b71ee06738b5b547b00e8102, 156 identical

    const z=(r,e,p)=>{var t=makeWrapper(r);return exports.setup(t,r,e,p)};

  16. df1c43e5a72e92d11bdefcead13a5e14, 156 identical

    const z=t=>utils.isObject(t)&&t instanceof Impl.implementation;

  17. 094afc30995ff28993ec5326e8b3c4d4, 156 identical

    const z=i=>utils.isObject(i)&&utils.hasOwn(i,implSymbol)&&i[implSymbol]instanceof Impl.implementation;

  18. 042490db7093660e74a762447f64f950, 156 identical

    const z=(r,e,t)=>{t=exports.create(r,e,t);return utils.implForWrapper(t)};

  19. 5c5979ec3533f13b22153de05ffc64d5, 154 identical

    const z=function(e){if(e&&e.__esModule)return e;var t={};if(null!=e)for(var r in e)"default"!==r&&Object.prototype.hasOwnProperty.call(e,r)&&__createBinding(t,e,r);return __setModuleDefault(t,e),t};

  20. 50645492c50621c0847c4ebd1fdd65cd, 154 identical

    const z=function(e,n){Object.defineProperty(e,"default",{enumerable:!0,value:n})};

Which files use function 8 (79e9bd3cdf15cf0af97f73ccaed50fa0)

View file list

The list is long



Perhaps this small study will prompt someone to think about refactoring, or about changing their approach to programming.

It is good that people use some common approaches and patterns for programming.

It's bad that there is a lot of copy-paste.


The scripts used can be found on GitHub




