INTERNET-DRAFT                                                R. Housley
Intended Status: Informational                            Vigil Security
Expires: 21 January 2016                                    21 July 2015

      Statement of Work for Extensions to the IETF Datatracker for
                           Author Statistics

                 draft-housley-sow-author-statistics-00

Abstract

   This is the Statement of Work (SOW) for extensions to the IETF
   Datatracker to provide statistics about RFCs and Internet-Drafts and
   their authors.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

Copyright and License Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
   2. Purpose
   3. Statistics
       3.1. Documents
       3.2. Authors
       3.3. Affiliation of Authors
       3.4. Countries of Authors
       3.5. Continents of Authors
   4. IETF Meeting Attendees
       4.1. Countries of IETF Meeting Attendees
       4.2. Continents of IETF Meeting Attendees
   5. Existing Code
   6. Deployment
   7. Security Considerations
   8. IANA Considerations
   Author's Address

1. Introduction

   A prominent member of the IETF community has developed a set of tools
   to produce statistics about the authors of RFCs and Internet-Drafts.
   These tools analyze the documents themselves to produce statistics
   about the documents and their authors. The goal of the IETF
   Datatracker enhancements described in this document is to provide
   similar statistics and ensure that the software is maintained as part
   of the IETF information services. While some data may still need to
   be extracted from the documents themselves, as much data as possible
   should come from the IETF Datatracker database.

   Current statistics are available on the web at
   http://www.arkko.com/tools/docstats.html.

   The code that is used to produce these statistics is available at
   http://www.arkko.com/tools/authorstats.html.

2. Purpose

   Author statistics allow the community to understand where work is
   being done and by whom. The statistics show which individuals,
   companies, and geographic regions are the most active contributors,
   and how this activity has changed over the years.

   Some of the statistics provide "nice to know" information; however,
   others are sometimes used to describe a particular participant's
   contributions to the IETF or to study trends within IETF work.
   For instance, the IETF has been trying to increase the diversity of
   participants, and the statistics are one way to see the impact of
   those efforts. Also, the most active individuals are potential
   candidates for various leadership positions.

3. Statistics

   The enhancements to the IETF Datatracker shall provide statistics and
   graphs about documents, document authors, author affiliation, author
   country, and author continent.

   The statistics should also include trends relating to IETF meeting
   attendees, which the current tools do not track.

   For the purposes of these requirements, "recent Internet-Drafts" and
   "recent RFCs" cover documents that have been published in the last
   five years.

3.1. Documents

   The statistics shall provide insight into the number of authors per
   document. The current web page, which presents the statistics and a
   bar chart, can be seen at
   http://www.arkko.com/tools/rfcstats/authdistr.html.

   The statistics shall provide insight into the size of the documents.
   The current web page, which presents the statistics and a bar chart,
   can be seen at http://www.arkko.com/tools/allstats/pagedistr.html.
   With the planned change in document format, some other measure of
   document size, such as word count, might be more appropriate.

   Additionally, statistics about the document format used by the
   authors should be provided; the current tools do not provide them.

   The statistics shall provide insight into the use of various
   specification techniques such as ABNF, ASN.1, C code, CBOR, JSON, and
   XML. The current web page, which does not include all of these
   techniques, can be seen at
   http://www.arkko.com/tools/allstats/formatdistr.html.

3.2. Authors

   The statistics shall provide insight into the distribution of authors
   according to the number of documents they have authored for recent
   Internet-Drafts, recent RFCs, and all RFCs. The current web pages
   that provide similar information include the statistics and a bar
   chart, and the web pages are available at
   http://www.arkko.com/tools/stats/authactdistr.html,
   http://www.arkko.com/tools/recrfcstats/authactdistr.html, and
   http://www.arkko.com/tools/rfcstats/authactdistr.html.

   The statistics shall provide insight into the relative impact of
   authors by the number of their RFCs that are cited by other RFCs.
   The current web page can be seen at
   http://www.arkko.com/tools/rfcstats/hindextop.html.

3.3. Affiliation of Authors

   The statistics shall provide insight into the affiliation of authors
   for recent Internet-Drafts, recent RFCs, and all RFCs. The current
   web pages that provide similar information include the statistics and
   a bar chart, and the web pages are available at
   http://www.arkko.com/tools/allstats/companies.html,
   http://www.arkko.com/tools/stats/companydistr.html,
   http://www.arkko.com/tools/recrfcstats/companydistr.html, and
   http://www.arkko.com/tools/rfcstats/companydistr.html.

   The statistics shall provide insight into the way that the
   affiliation of RFC authors has changed over the years. The current
   web page can be seen at
   http://www.arkko.com/tools/rfcstats/companydistrhist.html.

3.4. Countries of Authors

   The statistics shall provide insight into countries of authors for
   recent Internet-Drafts, recent RFCs, and all RFCs.
   It has been useful to provide country-based statistics, and it has
   also been useful to provide statistics showing the European Union
   (EU) as a single "country" for the sake of comparison with other
   large countries. The current web pages that provide similar
   information include the statistics and a bar chart, and the web
   pages are available at
   http://www.arkko.com/tools/rfcstats/countries.html,
   http://www.arkko.com/tools/stats/d-countrydistr.html,
   http://www.arkko.com/tools/stats/d-countryeudistr.html,
   http://www.arkko.com/tools/stats/countrydistr.html,
   http://www.arkko.com/tools/stats/countryeudistr.html,
   http://www.arkko.com/tools/recrfcstats/d-countrydistr.html,
   http://www.arkko.com/tools/recrfcstats/d-countryeudistr.html,
   http://www.arkko.com/tools/recrfcstats/countrydistr.html,
   http://www.arkko.com/tools/recrfcstats/countryeudistr.html,
   http://www.arkko.com/tools/rfcstats/d-countrydistr.html,
   http://www.arkko.com/tools/rfcstats/d-countryeudistr.html,
   http://www.arkko.com/tools/rfcstats/countrydistr.html, and
   http://www.arkko.com/tools/rfcstats/countryeudistr.html.

   The statistics shall provide insight into the way that the countries
   of RFC authors have changed over the years. The current web page can
   be seen at http://www.arkko.com/tools/rfcstats/countrydistrhist.html.

3.5. Continents of Authors

   The statistics shall provide insight into continents of authors for
   recent Internet-Drafts, recent RFCs, and all RFCs. The current web
   pages that provide similar information include the statistics and a
   bar chart, and the web pages are available at
   http://www.arkko.com/tools/stats/d-contdistr.html,
   http://www.arkko.com/tools/recrfcstats/d-contdistr.html, and
   http://www.arkko.com/tools/rfcstats/d-contdistr.html.

   The statistics shall provide insight into the way that the continents
   of RFC authors have changed over the years.
   The current web page can be seen at
   http://www.arkko.com/tools/rfcstats/d-contdistrhist.html.

4. IETF Meeting Attendees

   The enhancements to the IETF Datatracker shall provide statistics and
   graphs about the countries and continents of IETF meeting attendees.

4.1. Countries of IETF Meeting Attendees

   The statistics shall provide insight into countries of IETF meeting
   attendees for each meeting. Country-based statistics have been
   presented in the plenary session for many years. For consistency
   with the author statistics discussed in Section 3 of this document,
   the statistics will include a way of showing the EU as a single
   "country" for the sake of comparison with other large countries. The
   statistics for each meeting should be accompanied by a pie chart
   that shows the top eight countries and "other".

   The statistics shall provide insight into the way that the countries
   of IETF meeting attendees have changed over the years. Again, for
   consistency with the author statistics discussed in Section 3 of this
   document, the statistics will include a way of showing the EU as a
   single "country".

4.2. Continents of IETF Meeting Attendees

   The statistics shall provide insight into continents of IETF meeting
   attendees for each meeting.

   The statistics shall provide insight into the way that the continents
   of IETF meeting attendees have changed over the years.

5. Existing Code

   Since the new code will be driven by the Datatracker database to the
   greatest extent possible, the existing code may be of limited value.
   The existing code was also intended as a temporary solution and
   requires a rewrite. However, a set of heuristics used by the code
   may be useful. These heuristics are provided in a separate rule
   database, and are used as a last resort when there is otherwise too
   little information.
   The heuristics include author aliases, some recognized authors and
   some recognized affiliations, domain name data for determining
   location and affiliation, and mappings for some ways that people
   represent their countries in a postal address.

   Authors are not consistent about the way their name appears in
   various documents. For example, one document may include their given
   name and another document may include a nickname. The Datatracker
   database provides a way to capture aliases, but not all of the
   aliases in the documents have been added to the database.

   Apart from author aliases, the current Datatracker database does not
   have tables for the heuristics used in the current tool.
   Appropriate tables to hold the additional heuristics from the current
   rule database should be added to the Datatracker database in a manner
   agreed to by the people who maintain the Datatracker source code.

   A workable web interface, possibly using Django Admin, to update the
   new heuristics tables shall be provided.

   The current code can be found at
   http://www.arkko.com/tools/authorstats.html, and is openly available
   but without any warranty.

   The software is split into two parts, with the code itself being
   separate from the heuristics database. The two main components of
   the code are authorstats, which produces the statistics and generates
   the statistics web pages, and getauthors, which performs document
   analysis.

6. Deployment

   The current tools analyze the documents themselves to produce
   statistics. Some of the data needed to produce the statistics is not
   currently in the Datatracker database. This development effort will
   include adding the capability to capture this data in the Datatracker
   database, and populating it for all RFCs and the Internet-Drafts
   posted over the last five years.
   It may be cost-effective to use the existing code to extract the
   information and then verify it once.

   The URLs for the current tools appear in many places on the Web. Once
   a suitable replacement tool is available, the author of the original
   tools has promised to provide a suitable form of redirection.

7. Security Considerations

   This document contains the statement of work (SOW) for enhancements
   to the IETF Datatracker to provide author statistics. These
   enhancements do not affect the security of the Internet. The
   enhancements provide statistics about documents that are available to
   the public without prior authentication, and the statistics will also
   be available to the public without prior authentication.

8. IANA Considerations

   No changes to the IANA registries are suggested by this document.

Author's Address

   Russ Housley
   Vigil Security, LLC
   918 Spring Knoll Drive
   Herndon, VA 20170
   USA

   Email: housley@vigilsec.com
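
Appendix A. Sketch of the Top-Eight Country Grouping

   Section 4.1 asks for per-meeting statistics accompanied by a pie
   chart of the top eight countries and "other", with a way of showing
   the EU as a single "country". The Python sketch below illustrates
   one way that grouping could be computed. It is not taken from the
   existing tools. The function name top_countries, its parameters,
   and the use of two-letter country codes are assumptions made for
   illustration, and the set of EU members is supplied by the caller
   rather than defined here.

```python
from collections import Counter
from typing import FrozenSet, Iterable, List, Tuple


def top_countries(
    countries: Iterable[str],
    top_n: int = 8,
    eu_members: FrozenSet[str] = frozenset(),
) -> List[Tuple[str, int]]:
    """Group attendee countries into the top_n entries plus "other".

    countries  -- one country code per attendee (hypothetical input format)
    top_n      -- number of slices to keep before lumping the rest together
    eu_members -- codes to fold into a single "EU" entry; empty means
                  that every country is counted separately
    """
    # Fold EU members into one label, then count attendees per label.
    counts = Counter("EU" if c in eu_members else c for c in countries)

    # Largest first; ties are broken alphabetically so output is stable.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

    slices = ranked[:top_n]
    other = sum(n for _, n in ranked[top_n:])
    if other:
        slices.append(("other", other))
    return slices


# Illustrative data only; the EU set here is a small subset, not the
# full membership list.
attendees = ["US"] * 5 + ["DE"] * 2 + ["FR"] * 2 + ["JP"] * 3 + ["CN"] * 4
print(top_countries(attendees, top_n=3, eu_members=frozenset({"DE", "FR"})))
# prints [('US', 5), ('CN', 4), ('EU', 4), ('other', 3)]
```

   Passing an empty eu_members set yields the plain per-country view,
   so the same function could serve both presentations that Section 4.1
   describes.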