Editing article
Content
<h3>No, really, it is…</h3><p>I’ve been using an RSS aggregator as my primary method of consuming news for about 10 years, so now that I’m learning how to do more with Ruby than just execute blocks of code, I thought I’d dig into how I can use Ruby to interact with an RSS feed and see just what I can do with it. As it turns out, it’s really quite simple.</p><h3>What is RSS?</h3><p>RSS, created by a couple of engineers at Netscape back in the days when Netscape was still a thing, originally stood for RDF Site Summary, then Rich Site Summary (thanks, <a href="https://en.wikipedia.org/wiki/RSS">Wikipedia</a>!), but is now known as Really Simple Syndication. Along with the very similar Atom, it is a standardized format for providing a feed of a website’s content, generally to be used by a feed aggregator, such as <a href="https://feedly.com/">Feedly</a> or the late, lamented Google Reader.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/262/1*vUkpL86j98HEOJaWSCJZLg.png" /><figcaption>There’s an xkcd for everything.</figcaption></figure><p>If you’re not sure why this would be useful, well…just try keeping up with the content of a few dozen pages without using an aggregator. Sure, you could follow them all on Twitter or Google Plus (LOL), as long as you’re cool with an algorithm deciding which posts actually show up in your feed, but a good RSS aggregator will give you total control over your newsfeed and ensure that you never miss a post, to say nothing of the ability to turn a Craigslist or eBay search into a custom RSS feed.</p><h3>OK, so how does it work?</h3><p>An RSS feed is just an XML-formatted document, and if you’ve ever published a blog post then you’ve almost certainly published an RSS feed along with it, whether you were aware of it or not. 
Below is the RSS feed for this very blog, which you can see for yourself at <a href="https://medium.com/feed/@krandles">https://medium.com/feed/@krandles</a>.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/827/1*rpT80Jj8zPCAp_jiUYK1xQ.jpeg" /></figure><p>Since I’m in the middle of writing my first post here, it doesn’t actually contain any articles yet, but you can see it already provides some basic information about my blog, and we’ll take a look at an actual post a bit later on.</p><p>First, let’s get into how we can use Ruby to scrape an RSS feed from a website and turn it into something we can work with.</p><h3>require 'rss'</h3><p>A quick search of <a href="https://rubygems.org/search?utf8=%E2%9C%93&query=rss">rubygems.org</a> for “rss” returns a couple hundred results, but all we need to get started (along with open-uri, so Ruby can open webpages) is Ruby’s standard library, which provides the ability to produce and consume feeds via the <a href="http://ruby-doc.org/stdlib-2.4.2/libdoc/rss/rdoc/RSS.html">RSS module</a>:</p><pre>require 'rss'<br>require 'open-uri'</pre><p>Trying out the example code provided in the Ruby documentation shows off the basic functionality:</p><pre>url = 'https://medium.com/feed/@olegchursin/'</pre><pre># URI.open comes from open-uri; a bare open(url) stopped opening URLs in Ruby 3.0<br>URI.open(url) do |rss|<br>  feed = RSS::Parser.parse(rss)<br>  puts "Title: #{feed.channel.title}"<br>  feed.items.each do |item|<br>    puts "Item: #{item.title}"<br>  end<br>end</pre><p>This parses the feed’s XML data into an object of class RSS::Rss and demonstrates calling #channel and #items to output the titles of the channel (Hi, Oleg!) 
and its recent articles:</p><pre>Title: Stories by Oleg Chursin on Medium<br>Item: A Brief Introduction to Domain Modeling<br>Item: Refactoring in Ruby</pre><p>Now that we have the feed in a format Ruby can easily use, we can call the #items method on it, which returns an array containing the feed’s articles as objects of class RSS::Rss::Channel::Item, then call each Item’s methods to access title, description, creator, etc., and a link to the full article.</p><p>All of this turned out to be much easier than I was expecting, and I wanted to play around with RSS a little more, so I started building some methods to interact with a feed, then created an RSSFeed class and a command line interface to navigate and display the feed’s contents and open full articles in a browser. A couple hours later I ended up with an RSS reader application. It’s still very basic, but I hope to build on it as I continue to expand my knowledge — next up is storing feeds, channels, and articles in a database, then perhaps displaying all of this on a webpage instead of in a terminal window.</p><p>If you’re curious about how I did all this, it’s up on GitHub at the link below, and you can expect to see more posts here related to this project as I continue to build it.</p><p><a href="https://github.com/krandles/really-simple-rss">krandles/really-simple-rss</a></p>
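To make the item-level access described above concrete, here is a minimal sketch that parses a small feed and collects each item's title, link, and description into plain hashes. The feed is a made-up sample inlined as a string (the `example.com` URLs, titles, and the `sample_feed` and `articles` names are assumptions for illustration, not taken from the original project), so it runs without network access. It assumes the `rss` library is available, which on Ruby 3.0 and later means the bundled `rss` gem.

```ruby
require 'rss'

# Hypothetical sample feed, inlined so the example needs no network access.
sample_feed = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <rss version="2.0">
    <channel>
      <title>Example Blog</title>
      <link>https://example.com/</link>
      <description>A made-up feed for illustration</description>
      <item>
        <title>First Post</title>
        <link>https://example.com/first-post</link>
        <description>Summary of the first post.</description>
      </item>
      <item>
        <title>Second Post</title>
        <link>https://example.com/second-post</link>
        <description>Summary of the second post.</description>
      </item>
    </channel>
  </rss>
XML

# RSS::Parser.parse accepts an XML string as well as an IO object.
feed = RSS::Parser.parse(sample_feed)

# Copy the fields of each RSS::Rss::Channel::Item into a plain hash.
articles = feed.items.map do |item|
  { title: item.title, link: item.link, description: item.description }
end

puts "Channel: #{feed.channel.title}"
articles.each_with_index do |article, index|
  puts "#{index + 1}. #{article[:title]} (#{article[:link]})"
end
```

Turning items into hashes like this is only one possible design; it keeps the rest of a reader (a CLI menu, a database layer) independent of the RSS library's own classes.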
Hidden blurb
--- !ruby/object:Feedjira::Parser::RSSEntry
published: 2017-12-21 03:20:48.000000000 Z
carlessian_info:
  news_filer_version: 2
  newspaper: Kevin Randles on Medium
  macro_region: Technology
entry_id: !ruby/object:Feedjira::Parser::GloballyUniqueIdentifier
  is_perma_link: 'false'
  guid: https://medium.com/p/a32a8654733a
title: RSS and Ruby — It’s Really Simple
categories:
- rss
- ruby
- flatiron-school
rss_fields:
- title
- url
- author
- categories
- published
- entry_id
- content
url: https://medium.com/@krandles/rss-and-ruby-its-really-simple-a32a8654733a?source=rss-d451d084d34a------2
author: Kevin Randles
Ricc internal notes
Imported via /Users/ricc/git/gemini-news-crawler/webapp/db/seeds.d/import-feedjira.rb on 2024-04-03 16:31:16 +0200. Content is EMPTY here. Entries: title,url,author,categories,published,entry_id,content. TODO add Newspaper: filename = /Users/ricc/git/gemini-news-crawler/webapp/db/seeds.d/../../../crawler/out/feedjira/Technology/Kevin Randles on Medium/2017-12-21-RSS_and_Ruby_—_It’s_Really_Simple-v2.yaml