How to Read Websites in SwiftUI — Data Scraping in iOS

Web scraping in SwiftUI made easy

Ege Sucu
4 min read · May 10, 2022

We live in an era in which using APIs is common. As mobile developers, we’re used to encoding and decoding JSON to exchange data between our app and a server. Sadly, not all websites and services provide an API. Sometimes you need to read the website itself to get what you want. We call this data scraping: fetching a page’s HTML and extracting the data you need by filtering it with CSS selectors. So how can we get this data into our SwiftUI app?

Library

For this purpose, we need a tool that can properly parse and read an HTML body. I chose SwiftSoup, a 100% Swift library, for the job. You can add it as a CocoaPods pod or as a Swift Package; I prefer the latter.

Project

For today’s example, I’d like to create a simple blog reader app that parses swiftbysundell.com’s articles. Let’s investigate the site’s HTML structure by right-clicking the page and viewing its source.

Here is what we care about: all articles are listed inside <ul class="item-list">, and each one is wrapped in an <article> element. Inside an article, we have an <h1> as our title, a date inside <span class="date">, and the article’s URL in an <a href="">. We need to scrape this data to create our article model.

Structure the App

First, I create a simple struct that will hold our data. I’ll make it Identifiable and Hashable so it works properly with SwiftUI’s List and ForEach views.
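A minimal sketch of such a model is shown here. The property names follow the Article(title:url:publishDate:) call used later in the data service; the id property and its UUID type are assumptions, chosen only to satisfy Identifiable.

```swift
import Foundation

// Sketch of the article model. `title`, `url`, and `publishDate` match the
// initializer call used later in the service; `id` is an assumed property.
struct Article: Identifiable, Hashable {
    let id = UUID()          // assumed: a fresh identifier per parsed article
    let title: String
    let url: URL
    let publishDate: Date
}
```

Because id is a constant with a default value, Swift leaves it out of the synthesized memberwise initializer, so the model can still be built with just a title, a URL, and a date.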

Creating the View

Next, I’ll create the view: a searchable list that opens the article’s page in Safari when I tap a cell.

First, I create a results array that holds our articles, filtered by the search term. The search looks at both the article’s title and its date.
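One way to sketch this filtering is as a plain function that a computed property in the view could call. The function name filterArticles, the stand-in Article struct, and the "dd MMM yyyy" display format used to match dates are all assumptions for illustration.

```swift
import Foundation

// Minimal stand-in for the article model described earlier.
struct Article: Identifiable, Hashable {
    let id = UUID()
    let title: String
    let url: URL
    let publishDate: Date
}

// Hypothetical helper: returns the articles whose title or formatted date
// contains the search term. An empty term returns every article.
func filterArticles(_ articles: [Article], matching searchTerm: String) -> [Article] {
    let term = searchTerm.trimmingCharacters(in: .whitespacesAndNewlines)
    guard !term.isEmpty else { return articles }

    // Assumed display format, so a search like "May 2022" can match a date.
    let formatter = DateFormatter()
    formatter.locale = Locale(identifier: "en_US_POSIX")
    formatter.dateFormat = "dd MMM yyyy"

    return articles.filter { article in
        article.title.localizedCaseInsensitiveContains(term)
            || formatter.string(from: article.publishDate)
                .localizedCaseInsensitiveContains(term)
    }
}
```

In the view, the search text would typically live in a @State property bound with the searchable(text:) modifier, and the list would render the filtered result.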

After this, I will create a section structure that shows a header and a footer. There will be two sections: one for today’s posts and one for previous posts.
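A rough sketch of the two sections follows. The view name ArticleSectionsView, the header and footer texts, and the use of Link to open the page are assumptions; the original view may be laid out differently.

```swift
import Foundation
import SwiftUI

// Minimal stand-in for the article model described earlier.
struct Article: Identifiable, Hashable {
    let id = UUID()
    let title: String
    let url: URL
    let publishDate: Date
}

// Hypothetical view: splits the posts into "today" and "previous" sections.
struct ArticleSectionsView: View {
    let articles: [Article]

    // Posts published on the current calendar day.
    var todaysPosts: [Article] {
        articles.filter { Calendar.current.isDateInToday($0.publishDate) }
    }

    // Everything that was not published today.
    var previousPosts: [Article] {
        articles.filter { !Calendar.current.isDateInToday($0.publishDate) }
    }

    var body: some View {
        List {
            Section {
                ForEach(todaysPosts) { article in
                    // Link hands the URL to the system, which opens it in the browser.
                    Link(article.title, destination: article.url)
                }
            } header: {
                Text("Today")
            } footer: {
                Text("Posts: \(todaysPosts.count)")
            }

            Section {
                ForEach(previousPosts) { article in
                    Link(article.title, destination: article.url)
                }
            } header: {
                Text("Previous")
            } footer: {
                Text("Posts: \(previousPosts.count)")
            }
        }
    }
}
```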

Next, I’ll create a simple fetch function that will fetch data from our dataModel.

ArticleCell is a simple subview in which I’ll show the data.
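As an illustration, a cell along these lines would show the title with the date underneath. The layout, fonts, and the stand-in Article struct are assumptions rather than the original implementation.

```swift
import Foundation
import SwiftUI

// Minimal stand-in for the article model described earlier.
struct Article: Identifiable, Hashable {
    let id = UUID()
    let title: String
    let url: URL
    let publishDate: Date
}

// Sketch of a cell: title on top, publish date below in a smaller font.
struct ArticleCell: View {
    let article: Article

    var body: some View {
        VStack(alignment: .leading, spacing: 4) {
            Text(article.title)
                .font(.headline)
            // Text(_:style:) formats the date using the user's locale.
            Text(article.publishDate, style: .date)
                .font(.caption)
                .foregroundColor(.secondary)
        }
    }
}
```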

I also wrote some Date extensions that will help me filter today's posts from older ones and format the date how I want it to look.
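The extensions themselves might look something like the sketch below. The names isToday and displayString are hypothetical, and the DateFormatter(dateFormat:) convenience initializer is an assumption made to match the call used in the data service, since Foundation only provides a plain init().

```swift
import Foundation

extension DateFormatter {
    /// Assumed convenience initializer; not part of Foundation.
    convenience init(dateFormat: String) {
        self.init()
        // A fixed locale keeps month names such as "May" parseable on any device.
        self.locale = Locale(identifier: "en_US_POSIX")
        self.dateFormat = dateFormat
    }
}

extension Date {
    /// True when the date falls on the current calendar day.
    var isToday: Bool {
        Calendar.current.isDateInToday(self)
    }

    /// Hypothetical display helper that produces text such as "10 May 2022".
    var displayString: String {
        DateFormatter(dateFormat: "dd MMM yyyy").string(from: self)
    }
}
```

Creating a DateFormatter is relatively expensive, so if displayString is called for every row of a long list, caching a single shared formatter would be a reasonable refinement.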

Writing the Data Service

Now we need to implement our Data Service. I’ll create an ObservableObject with a Published variable named articleList and a baseURL which I’ll use.

I’ll write a fetchArticles function that will

  • empty the array if it’s already filled,
  • get the whole website as a string,
  • and parse the string as HTML with the help of SwiftSoup.

We’ll do the second and third steps inside a do-catch block, since these operations can throw an error.
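The second and third steps can be sketched as a small helper. The function name loadDocument is hypothetical, and using String(contentsOf:encoding:) to download the page is an assumption about how the string is obtained; SwiftSoup.parse is the library call that turns the string into a Document.

```swift
import Foundation
import SwiftSoup

// Hypothetical helper: download the page as a String, then parse it as HTML.
// Both calls can throw, so they share one do-catch block.
func loadDocument(from url: URL) -> Document? {
    do {
        // Synchronous download; this blocks the calling thread.
        let html = try String(contentsOf: url, encoding: .utf8)
        return try SwiftSoup.parse(html)
    } catch {
        print("Could not load \(url): \(error)")
        return nil
    }
}
```

Since the download is synchronous, a real app would likely call this off the main thread, or swap it for URLSession, to keep the UI responsive.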

let articles = try document.getElementsByClass("item-list").select("article")

This takes us to the list of articles on the page. Note that the call returns a collection of elements rather than a single one, so we’ll use a for loop to handle each article separately.

let title = try article.select("a").first()?.text(trimAndNormaliseWhitespace: true) ?? ""

This selects the first <a> tag and gets the text inside it, with any surrounding whitespace trimmed and normalised. If there is no link, the title falls back to an empty string.

let url = try baseURL.appendingPathComponent(article.select("a").attr("href"))

This reads the link’s href attribute and appends it to baseURL, which gives us the article’s full URL.

let dateString = try article.select("div").select("span").text()
    .replacingOccurrences(of: "Published on ", with: "")
    .replacingOccurrences(of: "Remastered on ", with: "")
    .replacingOccurrences(of: "Answered on ", with: "")
    .trimmingCharacters(in: .whitespacesAndNewlines)

This long chain fetches the date as a String. Since the text can include a prefix such as “Published on”, “Remastered on”, or “Answered on”, we need to strip it away and trim any invisible whitespace.

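I also want to convert this String into a Date, so I apply a DateFormatter. Note that DateFormatter(dateFormat:) is not a Foundation initializer, so it is presumably a small convenience extension defined in the project. If parsing fails, the date falls back to today.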

let formatter = DateFormatter(dateFormat: "dd MMM yyyy")
let date = Calendar.current.startOfDay(for: formatter.date(from: dateString) ?? Date.now)

Finally, I’ll create the Article and append it to the model’s list.

let post = Article(title: title, url: url, publishDate: date)
self.articleList.append(post)

The service class will look like this.
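A sketch of such a service, assembled from the snippets above, could look as follows. The class name DataService, the baseURL value, the stand-in Article struct, and the parseArticles(from:) helper (split out so the parsing can be tried without a network call) are assumptions, and a plain DateFormatter replaces the convenience initializer to keep the block self-contained.

```swift
import Foundation
import Combine
import SwiftSoup

// Minimal stand-in for the article model described earlier.
struct Article: Identifiable, Hashable {
    let id = UUID()
    let title: String
    let url: URL
    let publishDate: Date
}

// Hypothetical class name; the article only calls it the "Data Service".
final class DataService: ObservableObject {
    @Published var articleList = [Article]()

    // Assumed base URL: the root of the site used in this example.
    let baseURL = URL(string: "https://swiftbysundell.com")!

    func fetchArticles() {
        // Step 1: start from an empty list.
        articleList.removeAll()
        do {
            // Step 2: get the whole page as a string (synchronous download).
            let html = try String(contentsOf: baseURL, encoding: .utf8)
            // Step 3: parse it and publish the result.
            articleList = try parseArticles(from: html)
        } catch {
            print("Failed to fetch articles: \(error)")
        }
    }

    // Hypothetical helper: turns raw HTML into Article values.
    func parseArticles(from html: String) throws -> [Article] {
        let document = try SwiftSoup.parse(html)
        let articles = try document.getElementsByClass("item-list").select("article")

        let formatter = DateFormatter()
        formatter.locale = Locale(identifier: "en_US_POSIX")
        formatter.dateFormat = "dd MMM yyyy"

        var parsed = [Article]()
        for article in articles.array() {
            let link = try article.select("a")
            let title = try link.first()?.text(trimAndNormaliseWhitespace: true) ?? ""
            let url = try baseURL.appendingPathComponent(link.attr("href"))
            let dateString = try article.select("div").select("span").text()
                .replacingOccurrences(of: "Published on ", with: "")
                .replacingOccurrences(of: "Remastered on ", with: "")
                .replacingOccurrences(of: "Answered on ", with: "")
                .trimmingCharacters(in: .whitespacesAndNewlines)
            // Fall back to today when the date text cannot be parsed.
            let date = Calendar.current.startOfDay(for: formatter.date(from: dateString) ?? Date())
            parsed.append(Article(title: title, url: url, publishDate: date))
        }
        return parsed
    }
}
```

The view would hold this object, for example as a @StateObject, and call fetchArticles() when it appears. Because @Published properties drive the UI, the assignment to articleList should happen on the main thread.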

That’s it. Our app is working fine. You can check out the repo for the whole project.

The Result

Today, you learned how to scrape data to create your own data source. You can use this technique to get data from a website or to build your own blog app. Have any questions? Feel free to ask in the comment section down below. Have a nice day.
