I wanted to play around with some array manipulations in JavaScript and, for whatever reason, when the latest issue of Frontend Focus came into my inbox one of my first thoughts was, “I wonder how many words long the top articles end up being.”

I have no idea why I wanted to know this (genuinely, I don’t) but I did realise I could grab some statistics quite easily with JavaScript and it would scratch my itch to play with some array manipulations.

The Sample Data

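For this article I took the top 6 non-sponsored articles;

  • From SASS To PostCSS
  • CSS grid is coming
  • CSS Writing Modes
  • The Inner Workings of the Virtual DOM
  • Improving the UX of Names with Vocalizer.js
  • Learning from Lego: A Step Forward in Modular Web Design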

The Data I wanted

Here are the stats I (rather arbitrarily) wanted to know;

  • Average word count
  • Average image count
  • Average link count (internal and external)
  • Average lines of code

My General Method

My general method for grabbing all of this information is to find the DOM element we can consider the main content area and use it as the root for the queries that find the things I care about. Once I have that information, I put it into an object that I can then query to get averages.

Everything I’ve written was written and run in Chrome’s web console.

From SASS To PostCSS

The first article is where I will spend most of my time, because I will be figuring out a lot of the code that I will hopefully be able to copy and paste in future articles.

The first thing we need to do is find what I would consider to be the main post. I used the web inspector and eyeballed the results, and it didn’t take long to decide that the following code would work;

const mainPost = document.querySelector('.e-content');

Here I am setting a constant called mainPost to be the element returned by running querySelector against the document. The .e-content part says to look for an element with a class of “e-content”. If you know CSS selectors then any of them will work; if you don’t, fear not, I will explain any new ones I use in this article.

Getting an accurate word count isn’t a trivial task. There are things like emojis that you may or may not want to consider words, and do you want the words that make up figure captions or code samples to count?

To make my examples (and life) easier, I’ve decided to treat every space-separated chunk of the text that JavaScript’s textContent returns for a node as a word, which makes the code I need to write;

mainPost
  .textContent
  .split(' ')
  .filter(word => word.length > 0)
  .length

Which gives us a figure of 1873.

Before I go on, let me talk about what I’ve done here.

I asked the mainPost for its textContent, which returns one massive string containing all the text of the post.

I then split this string into an array. I split on spaces, so in theory I should now have a list of words.

I want to filter this array to only show me non-empty words (note: this could be improved to also filter out other things I don’t want to be considered words).

Now I have a filtered array, which I can ask for its length.
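As a sketch of the improvement hinted at in the note above, splitting on a whitespace pattern rather than a single space also separates words that are divided by newlines or tabs. The countWords name is an assumption for illustration, and this variant would very likely return a different figure from the 1873 quoted in this article, so treat it as an alternative rather than the code that produced the stats.

```javascript
// Hypothetical helper (not part of the original console session).
// Counts chunks of non-whitespace characters in a string.
function countWords(text) {
  return text
    .split(/\s+/) // split on runs of spaces, tabs and newlines
    .filter(word => word.length > 0) // drop the empty strings left at either end
    .length;
}

// A plain string stands in for mainPost.textContent here.
const sample = '  From Sass\nto PostCSS\tin one   afternoon ';
console.log(countWords(sample)); // 7
console.log(sample.split(' ').filter(word => word.length > 0).length); // 5
```

In the console this could be called as countWords(mainPost.textContent). The second log line shows how splitting on a single space glues together words that are separated only by a newline or a tab.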

Now we want to find the images, which takes a much smaller bit of code;

mainPost.querySelectorAll('img').length

This code asks the mainPost to query all the things in it for img elements, using querySelectorAll.

This will return a NodeList, which we can ask for its length.

This gives us 3.

Next up I wanted to grab the count of links, and I wanted to do this for both internal and external links. Normally internal links are relative, so they start with a /, which means we can do something like;

mainPost.querySelectorAll('a[href^="/"]').length

Which gives us 0.

Instead of just asking for instances of an element, we’ve used the attribute selector syntax to ask for a elements with an href attribute that starts with /.

We can make one small change in order to find links to external resources;

mainPost.querySelectorAll('a[href^="http"]').length

This gives us 17.
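The two prefix selectors assume that internal links are always relative and external links are always absolute. A hedged alternative, sketched here, is to resolve every href with the standard URL constructor and compare hostnames. The countLinksByHost name, the sample hrefs and the page URL are all made up for illustration, and "internal" here means "same hostname as the page", which is an assumption rather than the definition used for the stats in this article.

```javascript
// Hypothetical helper: count links as internal or external by comparing hostnames.
function countLinksByHost(hrefs, pageUrl) {
  const pageHost = new URL(pageUrl).hostname;
  const counts = { internal: 0, external: 0 };

  for (const href of hrefs) {
    let url;
    try {
      url = new URL(href, pageUrl); // resolves relative hrefs against the page URL
    } catch (err) {
      continue; // skip anything that cannot be parsed as a URL
    }
    // Ignore mailto:, javascript: and other non-web schemes.
    if (url.protocol !== 'http:' && url.protocol !== 'https:') continue;

    if (url.hostname === pageHost) {
      counts.internal += 1;
    } else {
      counts.external += 1;
    }
  }
  return counts;
}

// Made-up hrefs and page URL for illustration only.
const exampleHrefs = [
  '/about',
  'https://example.com/archive',
  'https://other.example/post',
  '//cdn.example.net/demo',
  'mailto:me@example.com',
];
console.log(countLinksByHost(exampleHrefs, 'https://example.com/blog/post'));
// { internal: 2, external: 2 }
```

In the console the hrefs could be gathered with Array.from(mainPost.querySelectorAll('a[href]'), a => a.getAttribute('href')) and the page URL taken from location.href. Note that fragment-only links such as #top resolve to the current page and would be counted as internal under this scheme.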

The final thing we wanted was lines of code. I decided that this meant lines within pre elements. I eyeballed the content to make sure that was fine, and it seems to be.

Array.from(mainPost.querySelectorAll('pre'))
  .reduce((count, codeBlock) => {
    return count + codeBlock
                    .textContent
                    .split('\n')
                    .filter(line => line.length > 0).length
  }, 0)

We are doing quite a few things here.

First, we want to turn our NodeList into an Array. This is because you can ask a NodeList for things like its length, but it doesn’t have Array methods like reduce. We do this with Array.from.

Once we have our array we want to reduce it. Reducing is the act of iterating over an array and combining its items into one output; in this case we want to iterate over each pre element and add the count of the lines within it to a running total.

The 0 in our code seeds the reduce with a starting count value which we can add to.

To get the number of lines (which we will add to count) I take the codeBlock, ask for its textContent (like we did for counting words) and split on \n, which is the newline character. Once I have this array of lines I filter it to remove any empty lines.

Once that mouthful is complete it returns 125, which is the final stat we wanted.
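To see the reduce in isolation, here is the same pattern applied to plain strings, each one standing in for the textContent of a pre element. The countLinesOfCode name and the sample strings are assumptions for illustration. It also makes one deliberate change, labelled in the code: lines containing only whitespace are treated as empty, which the code above does not do.

```javascript
// Hypothetical helper: total the non-blank lines across several code samples.
// Each string stands in for the textContent of one pre element.
function countLinesOfCode(codeBlocks) {
  return codeBlocks.reduce((count, codeBlock) => {
    const lines = codeBlock
      .split('\n')
      // Assumption: whitespace-only lines count as empty (the original keeps them).
      .filter(line => line.trim().length > 0);
    return count + lines.length;
  }, 0); // 0 seeds the running total, so an empty list gives 0
}

const samples = ['const a = 1;\nconst b = 2;\n', '\n   \nconsole.log(a + b);'];
console.log(countLinesOfCode(samples)); // 3
```

The first sample contributes two lines and the second contributes one, because its blank and whitespace-only lines are filtered out before the count is added to the running total.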

Final Stats

  • WordCount = 1873
  • ImageCount = 3
  • LinkCountInternal = 0
  • LinkCountExternal = 17
  • LinesOfCode = 125

CSS grid is coming

The next article should go a lot more smoothly since we should be able to re-use most of our code.

The main difference will be selecting the mainPost. I found a class called .content which would do it, but to be different I wanted to hang it off an element, which in this article’s case was article;

const mainPost = document.querySelector('article.content');

All of the other code was exactly as I had written before, so I won’t bore you with it again.
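Since the same five queries get re-run for each article, one way to avoid copying and pasting would be to bundle them into a single function that takes the main content element. This is only a sketch: the collectStats name and the property names are assumptions, and the queries are the ones already shown for the first article.

```javascript
// Hypothetical helper bundling the five queries from the first article.
// `root` is the main content element, e.g. document.querySelector('.e-content').
function collectStats(root) {
  // Split text on a separator and count the non-empty pieces.
  const countNonEmpty = (text, separator) =>
    text.split(separator).filter(part => part.length > 0).length;

  return {
    wordCount: countNonEmpty(root.textContent, ' '),
    imageCount: root.querySelectorAll('img').length,
    linkCountInternal: root.querySelectorAll('a[href^="/"]').length,
    linkCountExternal: root.querySelectorAll('a[href^="http"]').length,
    linesOfCode: Array.from(root.querySelectorAll('pre')).reduce(
      (count, codeBlock) => count + countNonEmpty(codeBlock.textContent, '\n'),
      0
    ),
  };
}
```

In the console this could be called as collectStats(document.querySelector('article.content')). It only covers sites where the prefix selectors are a reasonable test for internal and external links.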

Final Stats

  • WordCount = 913
  • ImageCount = 1
  • LinkCountInternal = 2
  • LinkCountExternal = 8
  • LinesOfCode = 0

CSS Writing Modes

We are building up a head of steam now and can jump straight into our third article.

Again, all the code was identical bar setting the mainPost;

const mainPost = document.querySelector('.c-article__main');

Final Stats

  • WordCount = 3009
  • ImageCount = 20
  • LinkCountInternal = 0
  • LinkCountExternal = 14
  • LinesOfCode = 46

The Inner Workings of the Virtual DOM

For our fourth article I had to make some small changes.

The mainPost was set using;

const mainPost = document.querySelector('section[name="072b"]');

When it came to working out internal vs external links I couldn’t do what I had done before because this was posted on Medium. Medium has a subdirectory for the user account and has no notion of relative links for internal content.

To get internal links I went with a small variation, simply asking for links whose href started with the author’s Medium URL;

mainPost.querySelectorAll('a[href^="https:[email protected]"]').length

The external link count is a bit more exciting, because I had to use filter;

Array.from(mainPost.querySelectorAll('a'))
  .map(link => link.href)
  .filter(link => !(link.includes('https:[email protected]')))
  .length

What this does is map over all our a elements and return each one’s href property, which is the fully resolved URL.

I then filter these to keep only the URLs that don’t include the author’s Medium URL.

Finally I can get the length of this Array.
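One subtlety: the ^= selector only matches hrefs that start with the prefix, while includes matches the prefix anywhere in the URL, so a link that merely mentions the author’s URL (in a query string, say) would be counted in neither total. A sketch that uses startsWith for both halves keeps the two counts complementary. The partitionByPrefix name and every URL below are made up for illustration.

```javascript
// Hypothetical helper: partition absolute URLs by whether they start with a prefix.
function partitionByPrefix(hrefs, prefix) {
  const internal = hrefs.filter(href => href.startsWith(prefix));
  const external = hrefs.filter(href => !href.startsWith(prefix));
  return { internal: internal.length, external: external.length };
}

// Made-up URLs for illustration only.
const authorUrl = 'https://blog.example/@someone';
const hrefs = [
  'https://blog.example/@someone/first-post',
  'https://blog.example/@someone/second-post',
  'https://elsewhere.example/share?u=https://blog.example/@someone/first-post',
];
console.log(partitionByPrefix(hrefs, authorUrl)); // { internal: 2, external: 1 }
```

With includes, the third URL would have been dropped from the external count even though it points at a different site.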

Final Stats

  • WordCount = 2047
  • ImageCount = 66
  • LinkCountInternal = 22
  • LinkCountExternal = 26
  • LinesOfCode = 6

Improving the UX of Names with Vocalizer.js

For our fifth article I was able to use the original code again, with one small change. Smashing Magazine references internal links with a fully qualified domain name, so I used the same method as for the Medium post.

const mainPost = document.querySelector('article');

As you can see I was a bit devil-may-care with this one and just went for the article element.

Final Stats

  • WordCount = 1513
  • ImageCount = 5
  • LinkCountInternal = 16
  • LinkCountExternal = 9
  • LinesOfCode = 49

Learning from Lego: A Step Forward in Modular Web Design

Our final article was a bit of a doozy. I had to write new code for all of this.

There wasn’t one good container that held just the article; the best candidate also held a lot of ads and cruft that I didn’t want to count towards anything.

There are a few ways I could have addressed this. I decided to select only the elements I care about;

const mainPost = document.querySelector('.main-content');
const mainContent = mainPost.querySelectorAll('figure, img, p, h2, h3, pre')

This says we have our mainPost but the thing we want to eventually hang other queries off is mainContent, which is a subset of the stuff inside mainPost that we actually care about. I got the list of these elements by eyeballing the source code.

In order to get the word count I had to use map and reduce;

Array.from(mainContent)
  .map(element =>
    element.textContent.split(' ')
    .filter(word => word.length > 0)
    .length
  )
  .reduce((count, element) => { return count + element})

I had to do the same sort of thing to get the image count;

Array.from(mainContent)
  .map(element =>
    element.querySelectorAll('img').length
  ).reduce((count, element) => { return count + element})

Likewise for link grabbing, although I was able to go back to using just / for my internal lookup;

// Internal links (relative hrefs)
Array.from(mainContent)
  .map(element => element.querySelectorAll('a[href^="/"]').length)
  .reduce((count, element) => { return count + element})

// External links (absolute hrefs)
Array.from(mainContent)
  .map(element => element.querySelectorAll('a[href^="http"]').length)
  .reduce((count, element) => { return count + element})
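The map-then-reduce pattern repeated above could be folded into one small helper. This is a sketch with made-up names (sumBy, wordsIn), not code from the original session. It also seeds the reduce with 0, because calling reduce on an empty array without an initial value throws a TypeError.

```javascript
// Hypothetical helper: apply countFn to every item and total the results.
function sumBy(items, countFn) {
  return Array.from(items)
    .map(item => countFn(item))
    .reduce((total, count) => total + count, 0); // seed of 0 makes an empty list safe
}

// Plain strings stand in for each element's textContent.
const paragraphs = ['Lego bricks snap together', 'So should components'];
const wordsIn = text => text.split(' ').filter(word => word.length > 0).length;
console.log(sumBy(paragraphs, wordsIn)); // 7
```

In the console the image count would then read sumBy(mainContent, element => element.querySelectorAll('img').length), and the link counts would follow the same shape.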

This article also had code samples using CodePen. For lines of code I decided to ignore these examples, as digging into iframes isn’t fun.

Array.from(mainPost.querySelectorAll('pre'))
  .reduce((count, codeBlock) => {
    return count + codeBlock
                    .textContent
                    .split('\n')
                    .filter(line => line.length > 0).length
  }, 0)

Final Stats

  • WordCount = 2424
  • ImageCount = 14
  • LinkCountInternal = 1
  • LinkCountExternal = 10
  • LinesOfCode = 78

Working out the Averages

Great, so now we have this data. I’ve stored it as 5 Arrays (it could have been stored in many different ways).

const wordCount = [1873, 913, 3009, 2047, 1513, 2424];
const imageCount = [3, 1, 20, 66, 5, 14];
const internalLinkCount = [0, 2, 0, 22, 16, 1];
const externalLinkCount = [17, 8, 14, 26, 9, 10];
const linesOfCodeCount = [125, 0, 46, 6, 49, 78];

Which means I can do the following;

wordCount.reduce((total, num) =>  total + num) / 6
imageCount.reduce((total, num) =>  total + num) / 6
internalLinkCount.reduce((total, num) =>  total + num) / 6
externalLinkCount.reduce((total, num) =>  total + num) / 6
linesOfCodeCount.reduce((total, num) =>  total + num) / 6
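The five near-identical lines could also be written with a small helper that divides by the array’s own length instead of a hard-coded 6. The average helper and the stats object are assumptions for illustration; the numbers are the ones collected above.

```javascript
// Hypothetical helper: arithmetic mean of an array of numbers.
function average(values) {
  if (values.length === 0) return NaN; // nothing to average
  return values.reduce((total, num) => total + num, 0) / values.length;
}

const stats = {
  wordCount: [1873, 913, 3009, 2047, 1513, 2424],
  imageCount: [3, 1, 20, 66, 5, 14],
  internalLinkCount: [0, 2, 0, 22, 16, 1],
  externalLinkCount: [17, 8, 14, 26, 9, 10],
  linesOfCodeCount: [125, 0, 46, 6, 49, 78],
};

// Log the average for each stat.
for (const [name, values] of Object.entries(stats)) {
  console.log(name, average(values));
}
```

Using values.length means the helper keeps working if a seventh article is ever added to the arrays.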

The meat of this article

HA! I fooled you, you thought this was some technical article with me messing around with some JavaScript.

NOPE. This is a think-piece on the stats you NEED TO FOLLOW in order to get a top position on Frontend Focus.

Unless you have EXACTLY 1963.1666666666667 words in your article with 18.166666666666668 images for good measure, along with 6.833333333333333 internal links and 14 external ones, and of course 50.666666666666664 lines of code, you will never get featured and your startup will fail, or worse, get acquired by Yahoo.

Want proof? This article doesn’t conform to those numbers and I bet you won’t be reading this in Frontend Focus anytime soon!