
What's this markup language? ... line endings instead of closing tags

I'm attempting to parse a document that looks similar to this:

<PRESOL>
<DATE>1112
<YEAR>12
<AGENCY>Defense Logistics Agency
<OFFICE>DLA Acquisition Locations
<LOCATION>DLA Land and Maritime
<ZIP>43218-3990
<CLASSCOD>59
<DESC>Proposed procurement for NSN 5365013055528 SPACER,PLATE:
Line 0001 Qty 70.00  UI EA  Deliver To: ARIZONA INDUSTRIES FOR THE BLIND By: 0180 DAYS ADOThe solicitation is an RFQ and will be available at the link provided in this notice.  Hard copies of this solicitation are not available.  Digitized drawings and Military Specifications   and Standards may be retrieved, or ordered, electronically.
All responsible sources may submit a quote which, if timely received, shall be considered.
Quotes must be submitted electronically.

<SETASIDE>HUBZone
.......
</PRESOL>

As you can see, it's bizarre, but perhaps it used to be some standard. The entire document appears to use a limited set of whitespace characters: for instance, I see no [tab], but I do see line breaks within some of the larger data blocks.

Does this look familiar to anyone?

I'm looking for a Rails gem that might parse this.

LessQuesar asked Jan 16 '23 01:01


2 Answers

It is a FederalBizOpps.gov Presolicitation Notice

Presolicitation Notice Template

The Presolicitation Template is used for the publication of notices for proposed acquisitions. FAR, Section 5.2 requires the submission of this document prior to the publication of any further actions. FBO will reject any other documents that refer to a specific solicitation without previous publication of the Presolicitation Notice for that solicitation.

 Tag           Description [Format]
 <PRESOL>                                                              
 <DATE>        Month and day synopsis is submitted [MMDD]
 <YEAR>        Year synopsis is submitted [YY]
 <CBAC>*       User ID for the Office Location. Assigned/managed by your location Administrator.    [string]
 <PASSWORD>*   Password. Assigned/managed by your location Administrator. [string]
 <ZIP>         The Contracting Office's ZIP code [5 Digits]
 <CLASSCOD>*   Either one alphabetic code or a two-digit code for service or supply that the synopsis should be listed under. [Valid classification code (FAR, Section 5.207(g))]
 <NAICS>*      Six-digit code for service or supply that the synopsis would be listed under [Valid NAICS Code]
 <OFFADD>      The complete address of the contracting office [Up to 65535 characters]
 <SUBJECT>     The classification code, two hyphens, and a brief title description of the synopsis. [Up to 255 characters]
 <SOLNBR>*     Unique reference number for the solicitation [Up to 128 characters from the set: a-z A-Z 0-9 - _ ( ) { }]
 <RESPDATE>    Response deadline date [MMDDYY]
 <ARCHDATE>    The date when this notice will be archived. [MMDDYYYY]
 <CONTACT>     The names and phone numbers of officials to contact in regard to this synopsis. If there are two points of contact, their information shall be separated by semicolon [Up to 65535 characters]
 <DESC>        A narrative description of the procurement action. [Up to 65535 characters]
 <LINK>        A structural tag [No data required or accepted]
 <URL>         The Government Agency's URL that will be listed with this award. [Up to 255 characters, consist of a restricted set of characters (see URL specification - RFC 2396)]
 <DESC>        Visible hypertext description provided to the user for linking to the related site [Up to 255 characters]
 <EMAIL>       A structural tag [No data required or accepted]
 <ADDRESS>     The Government Agency contact's email address [Up to 128 characters]
 <DESC>        Visible hypertext description provided for linking to the Government Agency contact's email [Up to 255 characters]
 <SETASIDE>    Identify set-aside acquisitions. [Valid values: 'Competitive 8(a)', 'Emerging Small Business', 'Woman Owned Small Business', 'Economically Disadvantaged Woman Owned Small Business', 'HUBZone', 'Partial HBCU / MI', 'Partial Small Business', 'Service-Disabled Veteran-Owned Small Business', 'Total HBCU / MI', 'Total Small Business', 'Veteran-Owned Small Business']
 <POPADDRESS>  Place of performance address [Up to 65535 characters]
 <POPZIP>      Place of performance ZIP code [Up to 5 digits]
 <POPCOUNTRY>  Place of performance country [Up to 32 characters]
 </PRESOL>     

Notes

  1. All red tags represent required data.
    * denotes validated data.
  2. <LINK>, <URL>, and <DESC> form a data group and should be provided or omitted together.
  3. <EMAIL>, <ADDRESS>, and <DESC> form a data group and should be provided or omitted together.
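
One practical consequence of notes 2 and 3: <DESC> appears three times in the template and means something different each time, so a flat tag-to-value map would overwrite itself. A rough Python sketch of one way to nest the groups, assuming the fields have already been split into (tag, value) pairs in document order. The names `GROUPS` and `group_fields` are made up for illustration, and the grouping rule is my reading of the notes, not anything FBO publishes:

```python
# Assumption: <LINK> and <EMAIL> carry no data and open a group; the
# <URL>/<ADDRESS> and <DESC> tags that follow belong to that group.
GROUPS = {"LINK": {"URL", "DESC"}, "EMAIL": {"ADDRESS", "DESC"}}

def group_fields(pairs):
    """Fold flat (tag, value) pairs into a dict, nesting LINK and EMAIL groups."""
    notice = {}
    current = None  # name of the group currently being filled, if any
    for tag, value in pairs:
        if tag in GROUPS:
            current = tag
            notice[tag] = {}
        elif current and tag in GROUPS[current] and tag not in notice[current]:
            notice[current][tag] = value
        else:
            current = None  # any other tag closes the open group
            notice[tag] = value
    return notice

pairs = [
    ("SOLNBR", "208-94-0008"),
    ("DESC", "Narrative description"),
    ("LINK", ""),
    ("URL", "http://www.abc.gov"),
    ("DESC", "Center for Mental Health"),
    ("EMAIL", ""),
    ("ADDRESS", "[email protected]"),  # address is obfuscated in the source
    ("DESC", "Center for Mental Health"),
    ("SETASIDE", "Total Small Disadvantage Business"),
]
notice = group_fields(pairs)
print(notice["DESC"])
print(notice["LINK"])
print(notice["EMAIL"])
```

This relies on the narrative <DESC> coming before <LINK>, as it does in the template; a notice that orders its tags differently would need a stricter rule.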

Example

<PRESOL> 
<DATE> 0521 
<YEAR> 99 
<CBAC> demo 
<PASSWORD> DEMO 
<ZIP> 22030 
<CLASSCOD> B 
<NAICS>123456 
<OFFADD> Office of Environmental Studies; 1323 Y Street, Washington, DC 22030 
<SUBJECT> B--ENERGY AND ENVIRONMENTAL SERVICES KNOWLEDGE DEVELOPMENT AND DISSEMINATION ACTIVITIES REGARDING THE HOMELESS MENTALLY ILL POPULATION 
<SOLNBR> 208-94-0008 
<RESPDATE> 061399 
<ARCHDATE> 07131999 
<CONTACT> Mary Ann Deal, Contract Specialist, 301-443-5329; Contracting Officer, Beatrice L. Woods, 301-443-0043 
<DESC> The Center for Mental Health Services is soliciting proposals on a full and open competitive basis from qualified organizations to award a 3-year contract to develop and disseminate new knowledge about effective approaches to providing comprehensive community-based services to persons with serious mental illnesses who are homeless. 
<LINK> 
<URL> http://www.abc.gov 
<DESC> Center for Mental Health 
<EMAIL> 
<ADDRESS> [email protected] 
<DESC> Center for Mental Health 
<SETASIDE> Total Small Disadvantage Business 
<POPADDRESS> Office of Environmental Studies; 1323 Y Street; Washington, DC 22030 
<POPZIP> 22030 
<POPCOUNTRY> US 
</PRESOL>
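
Failing a ready-made gem, the layout is regular enough to split by hand. Here is a minimal sketch in Python (my choice of language, not something FBO provides; the same regular expression should port to Ruby). It assumes tag names are uppercase letters and digits and that a value runs from the end of its tag to the start of the next one. `TAG` and `parse_notice` are names I made up for illustration:

```python
import re

# Matches an opening tag such as <DATE> or a closing tag such as </PRESOL>.
TAG = re.compile(r"<(/?)([A-Z][A-Z0-9]*)>")

def parse_notice(text):
    """Return a list of (tag, value) pairs in document order.

    A value is everything between one tag and the next, stripped of
    surrounding whitespace, so multi-line values and several tags on
    one line are both handled. Closing tags are reported with a
    leading "/" and an empty value.
    """
    pairs = []
    matches = list(TAG.finditer(text))
    for i, m in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        name = m.group(1) + m.group(2)
        value = "" if m.group(1) else text[m.end():end].strip()
        pairs.append((name, value))
    return pairs

sample = """<PRESOL>
<DATE>1112
<YEAR>12
<AGENCY>Defense Logistics Agency
<DESC>Proposed procurement for NSN 5365013055528 SPACER,PLATE:
Quotes must be submitted electronically.

<SETASIDE>HUBZone
</PRESOL>
"""

for tag, value in parse_notice(sample):
    print(tag, repr(value))
```

A list of pairs is returned rather than a dict because <DESC> can occur more than once in a notice. The obvious weakness is that free text containing something that looks like an uppercase tag would be split at that point, so treat this as a starting point, not a validating parser.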
Mads Hansen answered Mar 16 '23 08:03


(Understand that I haven't seen this before - this is all the result of some digging around)

This is the format for a Presolicitation Notice, as published by the United States' Federal Business Opportunities... something. This is one of fifteen data interchange formats defined by that organization.

I could find no description of the base format for that template. Which is unfortunate, because there are a ton of gotchas in SGML (as I mentioned in the comments, this sure looks a lot like SGML) that will bite you if you're not prepared for them. Here's an interesting example from Wikipedia: <QUOTE></QUOTE> can also be written as: <QUOTE//, or <QUOTE>.

The template documentation is limited to the format of the data expected in each field. For example:

<CLASSCOD>

Either one alphabetic code or a two-digit code for service or supply that the synopsis should be listed under. Valid classification code (FAR, Section 5.207(g))
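
Those per-field formats are simple enough to check with regular expressions. A small Python sketch, assuming the bracketed formats in the template are meant literally. `FORMATS` and `check_field` are hypothetical names, only a handful of fields are covered, and only the shape is checked (not whether a month is in range or a classification code is actually valid):

```python
import re

# Patterns transcribed from the format notes in the template documentation.
FORMATS = {
    "DATE": r"\d{4}",               # MMDD
    "YEAR": r"\d{2}",               # YY
    "ZIP": r"\d{5}",                # 5 digits
    "CLASSCOD": r"[A-Za-z]|\d{2}",  # one letter or two digits
    "NAICS": r"\d{6}",              # six-digit code
    "RESPDATE": r"\d{6}",           # MMDDYY
    "ARCHDATE": r"\d{8}",           # MMDDYYYY
}

def check_field(tag, value):
    """Return True if value has the documented shape, or if no shape is listed."""
    pattern = FORMATS.get(tag)
    return pattern is None or re.fullmatch(pattern, value.strip()) is not None

print(check_field("CLASSCOD", "59"))
print(check_field("ZIP", "43218-3990"))
```

Note that the sample in the question has `<ZIP>43218-3990`, which does not fit the documented five-digit format, so published notices may be looser than the submission template suggests.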

Michael Petrotta answered Mar 16 '23 09:03