Script to import/update the rss aggregation

Stefan Schlott 2013-08-27 09:18:36 +02:00
parent 49702d5574
commit f6187e6ebe
7 changed files with 191 additions and 35 deletions

@ -10,4 +10,5 @@ gem 'sass'
gem 'rdiscount'
gem 't'
gem 'nokogiri'
gem 'feedzirra'

@ -2,6 +2,8 @@ GEM
  remote: http://gems.github.com/
  remote: http://rubygems.org/
  specs:
    activesupport (3.1.12)
      multi_json (~> 1.0)
    addressable (2.3.3)
    adsf (1.0.1)
      rack (>= 1.0.0)
@ -11,6 +13,7 @@ GEM
    cookiejar (0.3.0)
    cri (2.3.0)
      colored (>= 1.2)
    curb (0.7.18)
    daemons (1.1.9)
    em-http-request (1.0.3)
      addressable (>= 2.2.3)
@ -28,6 +31,16 @@ GEM
    faraday (0.8.5)
      multipart-post (~> 1.1)
    fastercsv (1.5.5)
    feedzirra (0.1.3)
      activesupport (~> 3.1.1)
      builder (>= 2.1.2)
      curb (~> 0.7.15)
      i18n (>= 0.5.0)
      loofah (~> 1.2.0)
      nokogiri (>= 1.4.4)
      rake (>= 0.8.7)
      rdoc (~> 3.8)
      sax-machine (~> 0.1.0)
    ffi (1.9.0)
    formatador (0.2.4)
    geokit (1.6.5)
@ -43,12 +56,16 @@ GEM
      nanoc (>= 3.6.3)
    htmlentities (4.3.1)
    http_parser.rb (0.5.3)
    i18n (0.6.5)
    json (1.8.0)
    launchy (2.2.0)
      addressable (~> 2.3)
    listen (1.2.3)
      rb-fsevent (>= 0.9.3)
      rb-inotify (>= 0.9)
      rb-kqueue (>= 0.2)
    loofah (1.2.1)
      nokogiri (>= 1.4.4)
    lumberjack (1.0.4)
    method_source (0.8.2)
    mini_portile (0.5.1)
@ -67,14 +84,19 @@ GEM
      method_source (~> 0.8)
      slop (~> 3.4)
    rack (1.4.5)
    rake (10.1.0)
    rb-fsevent (0.9.3)
    rb-inotify (0.9.0)
      ffi (>= 0.5.0)
    rb-kqueue (0.2.0)
      ffi (>= 0.5.0)
    rdiscount (1.6.8)
    rdoc (3.12.2)
      json (~> 1.4)
    retryable (1.3.2)
    sass (3.2.10)
    sax-machine (0.1.0)
      nokogiri (> 0.0.0)
    simple_oauth (0.2.0)
    slop (3.4.6)
    systemu (2.5.2)
@ -108,6 +130,7 @@ PLATFORMS
DEPENDENCIES
  adsf
  builder
  feedzirra
  guard-nanoc
  nanoc3
  nokogiri

@ -1,11 +1,11 @@
 blogroll:
   -
-    user: skyr
-    atomfeed: https://stefan.ploing.de/atom.xml
+    user: Skyr
+    feed: https://stefan.ploing.de/atom.xml
   -
     user: princess
-    atomfeed: http://blog.querulantin.de/serendipity/index.php?/feeds/index.atom
+    feed: http://blog.querulantin.de/serendipity/index.php?/feeds/index.atom
   -
     user: Rince
-    atomfeed: http://blog.rince.de/feeds/index.atom
+    feed: http://blog.rince.de/feeds/index.rss2

9
content/planet-cccs.html Normal file
@ -0,0 +1,9 @@
<% @item[:blogposts].each do |post| %>
<h2><a href="<%= post[:url] %>"><%= post[:title] %></a></h2>
<ul class="unstyled inline">
<li><i class="icon-calendar"></i> <time itemprop="dateCreated" datetime="<%= post[:date].strftime("%Y-%m-%d") %>"><%= post[:date].strftime("%d.%m.%Y") %></time></li>
<li><i class="icon-pencil"></i> <%= post[:user] %></li>
</ul>
<p>
<% end %>
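
Reader's note on the template above: it reads the post list from `@item[:blogposts]` and writes `post[:url]` and `post[:title]` into the page as they arrive from the feeds. The sketch below is not part of the commit. It renders a trimmed copy of the template with the standard library's ERB and, as one possible hardening, escapes the feed-supplied values with `ERB::Util#h`. The `TEMPLATE` constant and the `item` hash are stand-ins for the layout file and the nanoc item, and the sample title is made up to show the escaping.

```ruby
require 'erb'

include ERB::Util # provides h / html_escape

# Trimmed stand-in for content/planet-cccs.html (assumption: only the
# heading and the date line matter for this illustration).
TEMPLATE = <<~'HTML'
  <% item[:blogposts].each do |post| %>
  <h2><a href="<%= h(post[:url]) %>"><%= h(post[:title]) %></a></h2>
  <time datetime="<%= post[:date].strftime('%Y-%m-%d') %>"><%= post[:date].strftime('%d.%m.%Y') %></time>
  <% end %>
HTML

# Hypothetical stand-in for the nanoc item; the title is invented sample data.
item = {
  blogposts: [
    {
      url: 'http://stefan.ploing.de/2013-08-02-zwei-git-repos-zusammenfuehren/',
      title: 'Tom & Jerry <live>',
      date: Time.utc(2013, 8, 2, 6, 38, 34),
      user: 'Skyr'
    }
  ]
}

html = ERB.new(TEMPLATE).result(binding)
puts html
```

Whether escaping belongs in the template or in the import script is a design choice; doing it at render time keeps the stored YAML free of HTML entities.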

@ -1,31 +0,0 @@
-----
title: Planet CCCS
kind: page
-----
Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy
eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam
voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet
clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit
amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam
nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat,
sed diam voluptua. At vero eos et accusam et justo duo dolores et ea
rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem
ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing
elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna
aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo
dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus
est Lorem ipsum dolor sit amet.
Duis autem vel eum iriure dolor in hendrerit in vulputate velit esse
molestie consequat, vel illum dolore eu feugiat nulla facilisis at vero
eros et accumsan et iusto odio dignissim qui blandit praesent luptatum
zzril delenit augue duis dolore te feugait nulla facilisi. Lorem ipsum
dolor sit amet, consectetuer adipiscing elit, sed diam nonummy nibh
euismod tincidunt ut laoreet dolore magna aliquam erat volutpat.
Ut wisi enim ad minim veniam, quis nostrud exerci tation ullamcorper
suscipit lobortis nisl ut aliquip ex ea commodo consequat. Duis autem
vel eum iriure dolor in hendrerit in vulputate velit esse molestie
consequat, vel illum dolore eu feugiat nulla facilisis at vero eros et
accumsan et iusto odio dignissim qui blandit praesent luptatum zzril
delenit augue duis dolore te feugait nulla facilisi.

99
content/planet-cccs.yaml Normal file
@ -0,0 +1,99 @@
---
title: Planet CCCS
kind: page
blogposts:
- user: princess
  date: 2013-08-23 11:37:25.000000000 Z
  title: !binary |-
    TWFjaHQgZXVjaCBmcmVpLi4u
  url: http://blog.querulantin.de/serendipity/index.php?/archives/355-Macht-euch-frei....html
- user: Skyr
  date: 2013-08-18 15:39:52.000000000 Z
  title: !binary |-
    VGVsZWZvbmJhbmtpbmctRm5vcmQ=
  url: http://stefan.ploing.de/2013-08-18-telefonbanking-fnord/
- user: Skyr
  date: 2013-08-12 08:49:35.000000000 Z
  title: !binary |-
    TWVpbiBQcm9tb3Rpb25zc3plbmFyaW8gLS0gamV0enQgaW4gTG9uZG9u
  url: http://stefan.ploing.de/2013-08-12-mein-promotionsszenario-in-london/
- user: princess
  date: 2013-08-07 15:14:01.000000000 Z
  title: Vom Tagesausflug zum Höllenritt
  url: http://blog.querulantin.de/serendipity/index.php?/archives/354-Vom-Tagesausflug-zum-Hoellenritt.html
- user: princess
  date: 2013-08-02 10:14:00.000000000 Z
  title: !binary |-
    MTMuIEluZm9ybWF0aWNhIEZlbWluYWxlIGFuIGRlciBIUyBGdXJ0d2FuZ2Vu
  url: http://blog.querulantin.de/serendipity/index.php?/archives/353-13.-Informatica-Feminale-an-der-HS-Furtwangen.html
- user: Skyr
  date: 2013-08-02 06:38:34.000000000 Z
  title: Zwei git-Repositories zusammenführen
  url: http://stefan.ploing.de/2013-08-02-zwei-git-repos-zusammenfuehren/
- user: Skyr
  date: 2013-08-01 11:15:07.000000000 Z
  title: !binary |-
    UmFudDogSGFsbG8sIEJsYWNraGF0IT8=
  url: http://stefan.ploing.de/2013-08-01-rant-hallo-blackhat/
- user: Skyr
  date: 2013-07-28 19:09:48.000000000 Z
  title: Überwachungsstaat - was ist das?
  url: http://stefan.ploing.de/2013-07-28-ueberwachungsstaat-was-ist-das/
- user: Skyr
  date: 2013-07-25 07:48:35.000000000 Z
  title: !binary |-
    VG9kby1MaXN0ZSBtaXQgdG9kby50eHQ=
  url: http://stefan.ploing.de/2013-07-25-todo-liste-mit-todotxt/
- user: princess
  date: 2013-07-24 20:35:00.000000000 Z
  title: !binary |-
    QXN5bGJld2VyYmVyIHVuZCBBcmJlaXQ=
  url: http://blog.querulantin.de/serendipity/index.php?/archives/352-Asylbewerber-und-Arbeit.html
- user: princess
  date: 2013-07-24 20:26:00.000000000 Z
  title: Kultusministerium BaWü untersagt dienstlichen Einsatz von sozialen Netzwerken
    an Schulen
  url: http://blog.querulantin.de/serendipity/index.php?/archives/351-Kultusministerium-BaWue-untersagt-dienstlichen-Einsatz-von-sozialen-Netzwerken-an-Schulen.html
- user: princess
  date: 2013-07-24 20:25:00.000000000 Z
  title: Bewässerung und Landwirtschaft in Äthiopien
  url: http://blog.querulantin.de/serendipity/index.php?/archives/350-Bewaesserung-und-Landwirtschaft-in-AEthiopien.html
- user: Skyr
  date: 2013-07-20 12:16:14.000000000 Z
  title: !binary |-
    RGllIFNwZWljaGVyLU1hdHJqb3NjaGth
  url: http://stefan.ploing.de/2013-07-20-speicher-matrjoschka/
- user: princess
  date: 2013-07-16 21:29:00.000000000 Z
  title: !binary |-
    VG91ciBkdXJjaCBkaWUgUGZhbHo=
  url: http://blog.querulantin.de/serendipity/index.php?/archives/347-Tour-durch-die-Pfalz.html
- user: princess
  date: 2013-07-16 20:18:46.000000000 Z
  title: !binary |-
    RmFrdCBoZXV0ZSBhYmVuZA==
  url: http://blog.querulantin.de/serendipity/index.php?/archives/349-Fakt-heute-abend.html
- user: princess
  date: 2013-07-15 16:44:45.000000000 Z
  title: !binary |-
    TGllZGVyaGFsbGUgLSBiYWNrc3RhZ2UgdW5kIG9uc3RhZ2Ug
  url: http://blog.querulantin.de/serendipity/index.php?/archives/348-Liederhalle-backstage-und-onstage.html
- user: Skyr
  date: 2013-07-12 07:32:35.000000000 Z
  title: Interessante Keynote über Scala
  url: http://stefan.ploing.de/2013-07-12-interessante-keynote-ueber-scala/
- user: princess
  date: 2013-06-17 16:24:09.000000000 Z
  title: !binary |-
    U2ViYXN0aWFuIEZpdHplazogIkRlciBOYWNodHdhbmRsZXIi
  url: http://blog.querulantin.de/serendipity/index.php?/archives/346-Sebastian-Fitzek-Der-Nachtwandler.html
- user: princess
  date: 2013-06-17 16:21:17.000000000 Z
  title: !binary |-
    UC5DLiBDYXN0OiAiQXVzZXJzZWhlbiI=
  url: http://blog.querulantin.de/serendipity/index.php?/archives/345-P.C.-Cast-Ausersehen.html
- user: princess
  date: 2013-06-11 19:40:00.000000000 Z
  title: !binary |-
    RWR3YXJkIFNub3dkZW4gaXN0IGVpbiBIZWxk
  url: http://blog.querulantin.de/serendipity/index.php?/archives/344-Edward-Snowden-ist-ein-Held.html
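
Reader's note on the `!binary` titles above: they are Base64 payloads of ordinary text. A likely cause, stated here as an assumption that the commit itself does not confirm, is that the feed parser returns some strings tagged as ASCII-8BIT and the YAML emitter of that era wrote such strings as binary. The sketch below decodes one payload from the file and shows a hypothetical `utf8` helper that retags a string as UTF-8 before dumping, so the title is written as plain text.

```ruby
require 'yaml'

# Decode one of the !binary payloads from the generated file.
# String#unpack1('m') is the core Base64 decoder.
decoded = 'TWFjaHQgZXVjaCBmcmVpLi4u'.unpack1('m').force_encoding(Encoding::UTF_8)
puts decoded

# Hypothetical helper (not in the commit): return a copy tagged as UTF-8.
def utf8(str)
  str.dup.force_encoding(Encoding::UTF_8)
end

# Simulate a title that arrives tagged as binary (ASCII-8BIT).
raw_title = 'Macht euch frei...'.dup.force_encoding(Encoding::ASCII_8BIT)

dumped = { 'title' => utf8(raw_title) }.to_yaml
puts dumped
```

In the import script the same retagging could be applied to `posting.title` before it is stored; that only makes sense if the feed bytes really are UTF-8.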

55
scripts/update-planetfeeds.rb Executable file
@ -0,0 +1,55 @@
#!/usr/bin/env ruby
# encoding: utf-8

require 'yaml'
require 'feedzirra'

# Map each feed URL in the blogroll file to the user it belongs to.
def getBlogroll(blogroll_file)
  blogroll_raw = YAML.load_file(blogroll_file)
  blogroll = { }
  blogroll_raw['blogroll'].each { |blog| blogroll[blog['feed']]=blog['user'] }
  blogroll
end

# Usage: update-planetfeeds.rb <blogroll file> <blogposts file>
blogroll_file = ARGV[0]
blogposts_file = ARGV[1]

# Read existing data
blogposts = if File.exist?(blogposts_file)
  YAML.load_file(blogposts_file)
else
  { }
end
if !blogposts['blogposts']
  blogposts['blogposts'] = []
end

blogroll = getBlogroll(blogroll_file)

# Build list for detecting duplicates
posturls = blogposts['blogposts'].map { |post| post['url'] }

# Read feed
feeds = Feedzirra::Feed.fetch_and_parse(blogroll.keys)

# Add feed data
feeds.each do |feed,data|
  data.entries.each do |posting|
    if !posturls.include?(posting.url)
      postdata = { }
      postdata['user'] = blogroll[feed]
      postdata['date'] = posting.published
      postdata['title'] = posting.title
      postdata['url'] = posting.url
      blogposts['blogposts'] << postdata
    end
  end
end

# Sort (newest first), limit list to 20 entries
blogposts['blogposts'].sort! { |a,b| b['date'] <=> a['date'] }
blogposts['blogposts'] = blogposts['blogposts'][0..19]

# Output
File.open(blogposts_file, 'w+') {|f| f.write(blogposts.to_yaml) }
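
Reader's note on the script above: the merge step (skip known URLs, sort newest first, keep 20 entries) can be tried without fetching any feeds. The sketch below is not part of the commit. `merge_posts` is a hypothetical helper that mirrors that logic on plain hashes, and the two sample posts reuse URLs and dates from the generated planet-cccs.yaml.

```ruby
# Hypothetical helper (not in the commit): merge freshly fetched posts into
# the stored list, skipping URLs that are already present.
def merge_posts(existing, incoming, limit = 20)
  known_urls = existing.map { |post| post['url'] }
  fresh = incoming.reject { |post| known_urls.include?(post['url']) }
  # Newest first, then cut the list down to the limit.
  (existing + fresh).sort { |a, b| b['date'] <=> a['date'] }.first(limit)
end

stored = [
  { 'user' => 'Skyr', 'date' => Time.utc(2013, 8, 12, 8, 49, 35),
    'url' => 'http://stefan.ploing.de/2013-08-12-mein-promotionsszenario-in-london/' }
]

fetched = [
  stored.first.dup, # already known, must not be added twice
  { 'user' => 'Skyr', 'date' => Time.utc(2013, 8, 18, 15, 39, 52),
    'url' => 'http://stefan.ploing.de/2013-08-18-telefonbanking-fnord/' }
]

merged = merge_posts(stored, fetched)
merged.each { |post| puts "#{post['date'].strftime('%Y-%m-%d')} #{post['url']}" }
```

Keeping this step in a function of its own would make it testable in isolation; like the script, it assumes every entry carries a comparable `date`.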