SecurityAdvisories beat generated and marked COMPLETE


I was able to generate and update the beat for “Security Advisories” today.

https://fedoraproject.org/wiki/FWN/Beats/SecurityAdvisories

The GUI is not developed yet, so I set up the required variables by hand and ran the script to generate the beat in wiki format. With a GUI, this beat can be fully automated. The following is the proposed web-based GUI for generating the beat.

As noted, the inputs are the start and end dates, and the output format can be selected.

Within the next couple of days I'll complete the GUI part; after that the tool should be capable of producing the beat entirely without manual intervention.

Mailman .txt archive


As I noted in the last post, my next goal is to use the Mailman .txt archive to extract the required information.

The following is an extract of the .txt file. All messages are stored in one .txt file.

From bckurera at fedoraproject.org Sun Apr 1 15:23:17 2012
From: bckurera at fedoraproject.org (Buddhike Kurera)
Date: Sun, 1 Apr 2012 20:53:17 +0530
Subject: [Important Notice] Regarding students' application submission
Message-ID:


Dear Students,

We believe that your are enjoying the Fedora project and getting ready
to GSoC with Fedora.

The only way to identify the start of a mail is that it begins with a From line, as in the extract above. The next challenge is getting the URL related to the mail, because no URL is recorded in the .txt file. My plan is therefore to parse the .txt file and extract the required info, parse the archive HTML page and extract the info there, and finally link the two sets of extracted info. That way it is possible to generate a summary of the list.
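The splitting step could be sketched roughly as follows. This is Python rather than the PHP used in the project, and it assumes every separator line looks like the one in the extract above (address written with " at ", followed by a timestamp). The names `SEPARATOR`, `split_messages` and `summarize` and the Message-ID value in the sample are made up for illustration.

```python
import re
from email import message_from_string

# Separator line as it appears in the extract, e.g.
# "From bckurera at fedoraproject.org Sun Apr 1 15:23:17 2012"
# (assumption: all separators in the archive follow this shape)
SEPARATOR = re.compile(
    r"^From \S+ at \S+ +\w{3} +\w{3} +\d{1,2} +\d{2}:\d{2}:\d{2} +\d{4}$"
)

def split_messages(archive_text):
    """Split the text of a .txt archive into one string per mail."""
    messages, current = [], None
    for line in archive_text.splitlines():
        if SEPARATOR.match(line):
            if current is not None:
                messages.append("\n".join(current))
            current = []  # the separator line itself is not kept
        elif current is not None:
            current.append(line)
    if current is not None:
        messages.append("\n".join(current))
    return messages

def summarize(archive_text):
    """Return one dict per mail with sender, date, subject and Message-ID."""
    rows = []
    for raw in split_messages(archive_text):
        msg = message_from_string(raw)  # standard library header parser
        rows.append({
            "from": msg["From"],
            "date": msg["Date"],
            "subject": msg["Subject"],
            "message_id": msg["Message-ID"],
        })
    return rows

# Sample based on the extract; the Message-ID value is hypothetical.
SAMPLE = """From bckurera at fedoraproject.org Sun Apr 1 15:23:17 2012
From: bckurera at fedoraproject.org (Buddhike Kurera)
Date: Sun, 1 Apr 2012 20:53:17 +0530
Subject: [Important Notice] Regarding students' application submission
Message-ID: <hypothetical-1@example.org>

Dear Students,
"""

rows = summarize(SAMPLE)
print(rows[0]["subject"])
```

A body line that happens to match the separator pattern would be mistaken for a new mail, so a real implementation may need a stricter check.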

However, if only the subject and the sender are of concern, it is easier to use the HTML archive page. If the date also matters, you need to refer to the .txt file.

The interesting thing is that the link between mails is maintained using the Message-ID. A new mail usually has no In-Reply-To: or References: header; a reply carries these headers, and they connect it to the original mail. Using this connection we can build the relationships between mails.
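One way this relationship could be built is sketched below in Python. It only follows In-Reply-To (References could be used the same way), and it assumes each mail has already been reduced to a dict. The function name `build_threads` and the ids in the sample are hypothetical.

```python
def build_threads(mails):
    """Group replies under the Message-ID of the mail that started the thread.

    `mails` is a list of dicts with a 'message_id' key and an optional
    'in_reply_to' key (both are assumed names, not from the archive format).
    """
    parent = {m["message_id"]: m.get("in_reply_to") for m in mails}

    def root_of(mid):
        seen = set()
        # Walk up while the parent is a mail we know about; stop on cycles.
        while parent.get(mid) in parent and mid not in seen:
            seen.add(mid)
            mid = parent[mid]
        return mid

    threads = {}
    for m in mails:
        mid = m["message_id"]
        root = root_of(mid)
        threads.setdefault(root, [])
        if mid != root:
            threads[root].append(mid)
    return threads

# Hypothetical ids, for illustration only.
mails = [
    {"message_id": "<a>"},
    {"message_id": "<b>", "in_reply_to": "<a>"},
    {"message_id": "<c>", "in_reply_to": "<b>"},
    {"message_id": "<d>"},
]
threads = build_threads(mails)
print(threads)
```

A reply whose original mail is not in the parsed file is treated as the start of its own thread.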

2 days after the proposal was submitted


After submitting the proposal I was away celebrating Easter, and I only started working on the project on Monday. Therefore not much has been done yet.

However, I have parsed the subjects from the archive page and collected them into an array, using some simple PHP code I wrote myself. I decided not to use a library to parse the HTML of the archive page, because the libraries are not very efficient for these requirements. I think writing a parser myself will be easy and more efficient.

My plan is to extract data by parsing the archive page; the final output will be an array with subject, name and link. The archive page does not display the date and time an email was sent. To get that data we need to dig into the mail itself.
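To illustrate the target array, here is a rough Python sketch (not the PHP code mentioned above, and using the standard library parser rather than a hand-written one). The markup in `SAMPLE` is an assumption about how the archive index lists each mail, with a link to a numbered .html file followed by the sender in italics; the senders and file names are invented.

```python
import re
from html.parser import HTMLParser

class ArchiveIndexParser(HTMLParser):
    """Collect subject, name and link for each mail on an index page."""

    def __init__(self):
        super().__init__()
        self.entries = []
        self._field = None        # which key the current text belongs to
        self._await_name = False  # True between a mail link and its <I> tag

    def handle_starttag(self, tag, attrs):
        href = dict(attrs).get("href") or ""
        # Assumption: mail pages are named with digits only, e.g. 000101.html
        if tag == "a" and re.fullmatch(r"\d+\.html", href):
            self.entries.append({"subject": "", "name": "", "link": href})
            self._field = "subject"
            self._await_name = True
        elif tag == "i" and self._await_name:
            self._field = "name"
            self._await_name = False

    def handle_endtag(self, tag):
        if tag in ("a", "i"):
            self._field = None

    def handle_data(self, data):
        if self._field:
            self.entries[-1][self._field] += data

def parse_index(html):
    parser = ArchiveIndexParser()
    parser.feed(html)
    parser.close()
    return [{k: v.strip() for k, v in e.items()} for e in parser.entries]

# Assumed markup with invented senders and file names.
SAMPLE = """<UL>
<LI><A HREF="000101.html">[Proposal Need Review]
</A><A NAME="101">&nbsp;</A>
<I>Hypothetical Sender
</I>
<LI><A HREF="000102.html">my proposal
</A><A NAME="102">&nbsp;</A>
<I>Another Sender
</I>
</UL>
<I>footer text that is not an author</I>
"""

entries = parse_index(SAMPLE)
print(entries)
```

Stripping the whitespace also removes the trailing newline that shows up at the end of each subject in a raw dump.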

My mentor pointed out that using the .txt file would be easier than parsing the archive page. But I'll write the simple code first and generate the array.

Hopefully I'll finish it today and move on to parsing the .txt file on Tuesday.

The following is the var_dump of the array I got, with the subject only.


array(36) { [0]=> string(65) "[Important Notice] Regarding students application submission
" [1]=> string(27) "[Proposal Need Review]
" [2]=> string(16) "my proposal
" [3]=> string(67) "GSoC 2012 - Idea Page is freeze - 27 Ideas, under 8 categories
" [4]=> string(14) "GSOC Help
" [5]=> string(60) "Registered mentors join with summer-coding-discuss list
" [6]=> string(79) "GSoC : Integrate Proxy Settings and Network Connections(Locations) project
" [7]=> string(21) "Project Proposal
" [8]=> string(27) "GSoC: Insight Projects
" [9]=> string(84) "Fwd: GSoC : Integrate Proxy Settings and Network Connections(Locations) project
" [10]=> string(21) "updated proposal
" [11]=> string(41) "[GSoC Proposal] Java API/ABI Checker
" [12]=> string(39) "Proposal: Java API changes checker
" [13]=> string(42) "Mentors - start commenting in Melange
" [14]=> string(25) "about google melange
" [15]=> string(46) "Proposal : Insight Use Cases for Calendar
" [16]=> string(45) "Comments/ Feedback on Students proposals
" [17]=> string(43) "Fedora JBoss spin proposal application
" [18]=> string(50) "GSoC Proposal: Fedora On-Demand Build Service
" [19]=> string(73) "Public Comment for students - Proposal submissions on google-melange
" [20]=> string(26) "GSoC proposal: Dorrie
" [21]=> string(42) "GSOC 2012 Proposal: JBoss Fedora Spin
" [22]=> string(36) "GSOC intoduction of my proposal
" [23]=> string(39) "web hosting control panel project.
" [24]=> string(89) "GSoC proposal on Integrate Proxy Settings and Network Connections(Locations) by Thom
" [25]=> string(47) "GSoC2012: Proposal On OnDemandBuildService
" [26]=> string(47) "Student proposal submission - one day left
" [27]=> string(39) "New submission of proposal in GSoC
" [28]=> string(43) "summer-coding Digest, Vol 12, Issue 11
" [29]=> string(63) "Timelines [was:Re: summer-coding Digest, Vol 12, Issue 11]
" [30]=> string(9) "GSOC
" [31]=> string(51) "Students proposal submission has been closed !
" [32]=> string(25) "Submission wiki link
" [33]=> string(46) "Proposal submission is over, what is next
" [34]=> string(41) "Gsoc 2012 Fedora Audio Spin Proposal
" [35]=> string(43) "Fedora Audio Spin - GSoC 2012 Proposal
" }

GSoC Proposal – Semi Automation FWN


This is the proposal I submitted for my GSoC project, Semi-automation of FWN in Fedora.

Proposal Description

Please describe your proposal in detail. Include:

An overview of your proposal

The proposed tool will assist FWN editors in creating FWN issues with a few clicks. My proposal contains two phases. The first phase, planned to finish before the mid-term evaluation, will cater to the current need. The second phase will add improvements and some new services. The tool is web based, with PHP as the scripting language. An interface will hold things together, and every section has a wizard; by following the wizard the editor gets the content of the FWN, and copying and pasting that output into the Drupal module will be the editor's task.

The need you believe it fulfills

Currently, producing FWN releases is quite cumbersome, and because of the writers' busy schedules the FWN is not issued regularly and lacks a complete, consistently structured format. This tool will give some life to FWN issues and solve the problem. FWN issues will then require very little effort, and the manpower requirement will be reduced. According to the mentor for this idea, FWN is a main requirement for the Fedora project, so implementing this idea seems critical.

Any relevant experience you have

I have worked on a university project about parsing and acquiring data from web pages and producing XML. I am therefore familiar with some parsing techniques and algorithms. I am good at web application development using PHP, Python, HTML and AJAX.

How do you intend to implement your proposal

The main goal is to make the FWN process smoother by automating the FWN issuing process. This is a web-based application that can be accessed via an interface.

As of Issue 288, the FWN contains the following sections:

  1. Announcements
  2. Fedora In the News
  3. Ambassadors
  4. Security Advisories

Announcements

The following lists are parsed, and their mails are listed under the following topics:

Announcement - http://lists.fedoraproject.org/pipermail/announce/ is used to grab content. The content is listed by extracting the content of the mails as required.

Fedora Development News - http://lists.fedoraproject.org/pipermail/devel-announce/ is used to grab content. The content is listed by extracting the content of the mails as required.

To compose Upcoming Fedora Events, the content of http://fedoraproject.org/wiki/Events is used and parsed against the specific date range.

Fedora In the News

The marketing list (http://lists.fedoraproject.org/pipermail/marketing) is used to gather content. Mails with a link to an outside source will be listed with the first 3-4 sentences of the article. A direct link to the article is provided, along with the reporter's name. However, the detected list should be confirmed by the writer by checking or unchecking entries. That keeps normal discussion out of the list. The XML is then created, and it will be the input for composing the FWN.
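Detecting a link to an outside source could look roughly like this Python sketch. It treats any URL whose host is not under fedoraproject.org as "outside", which is my assumption about what the filter should do. The function name `outside_links` and the sample mail body are hypothetical.

```python
import re
from urllib.parse import urlparse

# Rough URL pattern: stops at whitespace, angle brackets and double quotes.
URL = re.compile(r"https?://[^\s<>\"]+")

def outside_links(body, own_domain="fedoraproject.org"):
    """Return the URLs in a mail body that point outside `own_domain`."""
    links = []
    for url in URL.findall(body):
        url = url.rstrip(".,);")  # drop punctuation that trails the link
        host = urlparse(url).hostname or ""
        if host != own_domain and not host.endswith("." + own_domain):
            links.append(url)
    return links

# Hypothetical mail body, for illustration only.
body = (
    "Nice review at http://example.com/fedora-17-review. "
    "Archive: http://lists.fedoraproject.org/pipermail/marketing/"
)
found = outside_links(body)
print(found)
```

Mails for which the result is empty would be left out of the candidate list before the writer confirms it.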

Ambassadors

This section is composed from the ambassadors mailing list (http://lists.fedoraproject.org/pipermail/ambassadors/).

Fedora Ambassadors Regional Meeting Minutes – The entries at http://fedoraproject.org/wiki/Ambassadors/Meetings#2012 will be parsed to obtain a list of meetings and the URLs of their meeting pages. The output will be XML passed to the FWN compose module, which builds the code required by the Drupal module.

FAMSCO Meetings – The mailing list is parsed and searched for ‘FAMSCO Meeting Minutes’, and the logs of the meetings are grabbed.

Summary of traffic on Ambassadors mailing list – The ambassadors list is parsed for the given time period to obtain the subjects and their authors; the created XML will be an input to the FWN composing module.

Welcome New Ambassadors – The ambassadors mailing list is parsed and searched for subjects starting with ‘Fedora Ambassadors Welcome Week’. The content of those mails is parsed to extract the names of the new ambassador and the mentor. Again, the XML is directed to the FWN compose module.

Events reported on Ambassadors mailing list – Mails on the same mailing list tagged with [Event Report] will be parsed, and the subject and author info used to create the XML that will be the input for the FWN composing module.

Campus Ambassadors mailing list: Summary of traffic – The topics of the campus ambassadors mailing list (http://lists.fedoraproject.org/pipermail/campus-ambassadors/) are parsed and listed.

Security Advisories

The package-announce mailing list (http://lists.fedoraproject.org/pipermail/package-announce) is used to write up this section. The module for this section will search the list for mails tagged [SECURITY] within the specified period (the given week) and for the specified release. It is enough to parse the subject lines of the mailing list archive.
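The subject filter might be sketched in Python as below. It assumes subjects of the form "[SECURITY] Fedora <release> Update: <package>", which is not spelled out in this proposal, and that each mail is available as a (date, subject) pair. The function name `security_updates` and the package names in the sample are invented.

```python
from datetime import date

def security_updates(mails, start, end, release):
    """Return package names from [SECURITY] subjects for one release and period.

    `mails` is an iterable of (sent_date, subject) pairs; the subject layout
    "[SECURITY] Fedora <release> Update: <package>" is an assumption.
    """
    wanted = f"[SECURITY] Fedora {release} Update:"
    hits = []
    for sent, subject in mails:
        if start <= sent <= end and subject.startswith(wanted):
            hits.append(subject[len(wanted):].strip())
    return hits

# Invented sample subjects, for illustration only.
mails = [
    (date(2012, 6, 4), "[SECURITY] Fedora 17 Update: example-pkg-1.0-1.fc17"),
    (date(2012, 6, 5), "Fedora 17 Update: other-pkg-2.0-1.fc17"),
    (date(2012, 6, 20), "[SECURITY] Fedora 17 Update: late-pkg-3.0-1.fc17"),
    (date(2012, 6, 6), "[SECURITY] Fedora 16 Update: old-pkg-1.1-2.fc16"),
]
week = security_updates(mails, date(2012, 6, 4), date(2012, 6, 10), 17)
print(week)
```

The start and end dates here play the same role as the two date inputs an editor would supply for the week being covered.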

The output will be an XML message that can be sent to the FWN composer module, which generates the code required by the Drupal module. Finally, the FWN composing module will output the code that should be entered into the Drupal module. The web interface will hold every section and function related to this tool.
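As an illustration of such an XML message, here is a small Python sketch. The element names (`beat`, `item`, `subject`, `author`, `link`) are placeholders I made up; the proposal does not fix a schema. The sample item is invented as well.

```python
import xml.etree.ElementTree as ET

def beat_to_xml(beat_name, items):
    """Serialize a list of item dicts into one XML string for a beat.

    The element and attribute names are hypothetical, not a defined schema.
    """
    root = ET.Element("beat", name=beat_name)
    for item in items:
        entry = ET.SubElement(root, "item")
        for key in ("subject", "author", "link"):
            ET.SubElement(entry, key).text = item.get(key, "")
    return ET.tostring(root, encoding="unicode")

# Invented sample item, for illustration only.
xml_message = beat_to_xml("SecurityAdvisories", [
    {"subject": "example-pkg-1.0-1.fc17", "author": "updates", "link": "000101.html"},
])
print(xml_message)
```

The composer module would read a message like this and turn it into the markup the editor pastes into Drupal.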

Final deliverable of the proposal at the end of the period

The main deliverable will be the tool to compose FWN: a web application, packaged as an RPM.

A rough timeline for your progress

April 7- April 23: Design a general template for an FWN issue.

April 23 – May 21: Finish the tool partially so that it generates a rough FWN issue.

May 21 – June 21 : Fine tune the tool and design GUI wizard to interact and add editor commands to generate the FWN.

June 21 – July 9 : Finish up and make sure the deliverable for the mid-term evaluation is available.

Mid-Term Evaluation: Above mentioned deliverable should be available (a working tool that can be used to issue FWN).

July 13- July 20 : Based on the XML input, the required Drupal code will be generated. Add some other beats as noted on http://fedoraproject.org/wiki/FWN/Beats#Beats_Sections.

July 20 – August 13 : Create an interface with AJAX support to explore the mailman archives whenever required by giving the name and the time range.

August 13- August 20 : Package the tool and submit the tool.

Any other details you feel we should consider

Although it is not a deliverable at the end of the project, I would like to extend the functionality of the tool with an AJAX interface that makes exploring the Mailman archives easier. This is not a promise but an idea. However, after the GSoC program I'll definitely finish this in my free time.

Have you communicated with a potential mentor? If so, who?

I have contacted the mentor and had a short IRC chat with Kevin Fenzi and Toshio Kuratomi.
