Znote (recipes)

Web scraping: Getform

Logs in to Getform through browser automation and retrieves all subscribed users.

 

Get users from Getform

This example automates the retrieval of all subscribed users from the Getform service.

Installation

npm package installation

npm i -S playwright

Scrape all users from Getform to a file

This code logs in, then collects all users page by page before storing them in a local file.

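// Replace with your Getform credentials.
// Note: sleep2, _fs, print and printJSON are not defined in this recipe;
// they are assumed to be helpers available in the Znote environment.
const USERNAME="XXX@gmail.com";
const PASSWORD="XXX";
const { chromium } = require('playwright');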

// Launch a visible browser window (call chromium.launch() without options to run headless)
const browser = await chromium.launch({
    headless: false,
    args: ['--window-size=1280,720'],
});

const page = await browser.newPage();

// login
await page.goto('https://app.getform.io/login');
await page.locator('//*[@id="__layout"]/div/div/form/div[2]/input').fill(USERNAME);
await page.locator('//*[@id="__layout"]/div/div/form/div[3]/input').fill(PASSWORD);
await page.locator('//*[@id="__layout"]/div/div/form/div[4]/button').click();
await sleep2(5000); // wait for the post-login AJAX requests to finish
// open the form's submissions page
await page.goto("https://app.getform.io/forms/18848");
await sleep2(3000); // wait for the submissions table to load
// count the pagination links to get the number of pages
const footer = await page.$('//*[@id="formScroll"]/div/div/div/div/div[2]/table/div/div/div');
const nbrPages = await footer.evaluate(element => element.children.length);

let allEmails = [];

// load page and get data in table
for (let i = 1; i <= nbrPages; i++) {
    // click on page number
    await page.locator('//*[@id="formScroll"]/div/div/div/div/div[2]/table/div/div/div/a['+i+']').click();
    await sleep2(2000);

    // get data: each row of the table body holds one submission
    const table = await page.$('//*[@id="formScroll"]/div/div/div/div/div[2]/table/div/tbody');
    const emails = await table.evaluate(element =>
        Array.from(element.children).map(row => ({
            email: row.children[1].innerText,
            submitDate: row.children[3].innerText,
        }))
    );
    allEmails = allEmails.concat(emails);
}

// write to file
_fs.writeFileSync('/Users/alagrede/Desktop/users.json', JSON.stringify(allEmails));
print("DONE");
//await page.screenshot({ path: '/Users/alagrede/example.png' });
//open("/Users/alagrede/example.png");
await browser.close();
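
The fixed delays in the script above (sleep2(5000), sleep2(2000)) either waste time on a fast connection or fail on a slow one. One possible alternative is to poll for a condition until it becomes true. The following is a minimal sketch, not part of the original recipe: waitUntil is a hypothetical helper name, and the default timeout and interval values are arbitrary assumptions.

```javascript
// Hypothetical helper (not provided by Znote or Playwright):
// polls `condition` until it returns a truthy value, or throws
// once `timeout` milliseconds have elapsed.
async function waitUntil(condition, { timeout = 10000, interval = 250 } = {}) {
    const deadline = Date.now() + timeout;
    while (true) {
        const result = await condition();
        if (result) return result; // condition met: hand back its value
        if (Date.now() >= deadline) {
            throw new Error(`Condition not met within ${timeout} ms`);
        }
        // pause before the next check
        await new Promise(resolve => setTimeout(resolve, interval));
    }
}
```

In the scraper, a call such as await waitUntil(async () => (await page.locator('tbody tr').count()) > 0) could stand in for sleep2(2000); the 'tbody tr' selector is an assumption about the page markup and would need to be checked against the real table.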

Read the file and remove duplicates

Since Getform doesn't remove duplicate users, we deduplicate the results by email before printing the number of unique users.

function uniqByKeepFirst(a, key) {
    let seen = new Set();
    return a.filter(item => {
        let k = key(item);
        return seen.has(k) ? false : seen.add(k);
    });
}

const data = JSON.parse(_fs.readFileSync('/Users/alagrede/Desktop/users.json', 'utf8'));
printJSON(uniqByKeepFirst(data, it => it.email).length);
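
To see what the deduplication keeps, here is a small in-memory example. The records are made up for illustration, and normalizing the email (trimming spaces and lowercasing) is an assumption that the original code does not make; it may or may not match how you want to treat near-identical addresses.

```javascript
// Same logic as the helper above, written out explicitly so the
// example runs on its own.
function uniqByKeepFirst(a, key) {
    const seen = new Set();
    return a.filter(item => {
        const k = key(item);
        if (seen.has(k)) return false; // already seen: drop the later entry
        seen.add(k);
        return true;                   // first occurrence: keep it
    });
}

// Made-up sample records in the shape produced by the scraper.
const sample = [
    { email: 'ann@example.com', submitDate: '2022-01-03' },
    { email: 'bob@example.com', submitDate: '2022-01-04' },
    { email: 'Ann@Example.com ', submitDate: '2022-02-10' },
];

// Assumption: addresses that differ only by case or surrounding
// spaces belong to the same user.
const unique = uniqByKeepFirst(sample, it => it.email.trim().toLowerCase());
console.log(unique.length); // 2
```

Because the filter keeps the first occurrence, the earliest submission for each address survives and later repeats are dropped.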
