Internet Engineering Task Force | Erin. M. Phillips |
Internet-Draft | Oofoo, LLC. |
Intended status: Informational | July 15, 2013 |
Expires: January 16, 2014 |
The Universal Voting Markup Language (UVML)
draft-phillips-uvml-04
This document describes the Universal Voting Markup Language (UVML), a syntax for the structured representation of opinion in free text. Using UVML, opinions can be encoded in text, image, or video, and reliably interpreted by either human readers or automated agents. UVML supports both rating and ranking semantics. Ratings may be scored using symbols associated with the five most commonly used opinion dimensions: quality (*), importance (!), outlook ($), support-opposition (+), and likelihood (%). In addition, UVML defines a syntax for optionally including a demographic signature by which voters can publish basic demographic information with their UVML votes. The design of UVML leverages cross-cultural sentiment and voting-systems scholarship.
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on January 16, 2014.
Copyright (c) 2013 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document.
This document may not be modified, and derivative works of it may not be created, and it may not be published except as an Internet-Draft.
The Universal Voting Markup Language (UVML) enables the structured expression of opinions in text, images, or video. The motivation behind UVML is to be found in the above quote, and in "the long tail" of disenfranchised opinion found in society in general and social media in particular.
This document contains a UVML Voting System description (Section 3), a UVML Technical Specification (Section 4), a limited set of UVML Implementation Standards (Section 5), a UVML Rhetorical Guide (Section 6), and a UVML Functional Specification (Section 7). Some miscellaneous topics are addressed in Notes (Section 8).
The meanings of the following terms require clarification in this context. Terms quoted ("") in this section will not be quoted afterward.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].
All UVML elements are defined using ABNF (Augmented Backus-Naur Form) notation. If you are not familiar with ABNF notation, please see [RFC5234] for a complete description. ABNF rule names are shown as verbatim.
The term "ballot" refers to something (text, image, video, or physical object) which includes one or more UVML votes.
The term "polling station" refers to a service that processes UVML ballots and responds to UVML queries.
The terms "target" and "topic" refer to the object of an opinion, as defined by FrameNet.
The term "contest" refers to a competition between candidates or options for which voters determine the outcome.
UVML is intended to enable a voting system with some deliberative qualities. Figure 1 shows the principal interactions between actors in the system.
[Diagram of the actors VOTER, PROXY, and POLLING STATION. Interactions shown: (1) UVML votes, VOTER to POLLING STATION; (2) voting results, POLLING STATION to VOTER; (3) UVML votes, VOTER to POLLING STATION via PROXY; (4) UVML votes on physical media.]
Figure 1
The following is a brief explanation of these interactions:
The UVML specification is shown below in ABNF notation. Additional details are provided in succeeding subsections. ABNF rules referenced below but not defined in this specification are either ABNF core rules or otherwise in the public domain.
Any document which contains UVML is a ballot. Ballots may contain free text, one or more votes, and optionally a demographic signature.
ABNF Rule | Definition |
---|---|
ballot | = [text] 1*(vote [text]) [signature] [text] |
vote | = rating / ranking |
rating | = HASH target score |
ranking | = 2HASH contest selections |
signature | = 3HASH profile |
; | |
;; rating | |
target | = name / this |
score | = [undecided] valuation |
name | = tag *( PERIOD tag ) |
this | = T H I S |
; | |
;; ranking | |
contest | = name |
selections | = 1*25selection |
selection | = *HWS [rank] HASH name [[undecided] valuation] |
rank | = DIGIT / "1" DIGIT / "2" ZEROTOFIVE |
; | |
;; score | |
undecided | = QUESTION |
valuation | = quality / importance / outlook |
valuation | =/ support-opposition / likelihood |
; | |
;; signature | |
profile | = [age [HYPHEN]] gender [jurisdiction] |
age | = 1*3DIGIT |
gender | = iso-5218-gender |
jurisdiction | = [country] [region] [area-code] |
country | = [HYPHEN] iso-3166-country |
region | = [HYPHEN] iso-3166-subdivision |
area-code | = [HYPHEN] 3*4DIGIT |
; | |
;; name | |
tag | = name-begin *70name-inner name-end |
name-begin | = LETTER / UPPER-ASCII / UNICODE |
name-inner | = name-begin / DIGIT |
name-end | = name-inner |
; | |
;; valuation | |
quality | = among-the-very-best / very-good |
quality | =/ good / fair / poor |
quality | =/ very-poor / among-the-very-worst |
; | |
importance | = highest-importance / very-important |
importance | =/ important / unimportant / irrelevant |
; | |
outlook | = never-more-optimistic / very-optimistic |
outlook | =/ optimistic / pessimistic |
outlook | =/ very-pessimistic / never-more-pessimistic |
; | |
support-opposition | = strongly-support / support |
support-opposition | =/ somewhat-support / somewhat-oppose |
support-opposition | =/ oppose / strongly-oppose |
; | |
likelihood | = definitely / very-likely / likely |
likelihood | =/ unlikely / very-unlikely / definitely-not |
; | |
;; ~~ quality ~~ | |
among-the-very-best | = 5*10STAR |
very-good | = 4STAR |
good | = 3STAR |
fair | = 2STAR |
poor | = 1STAR |
very-poor | = STAR MINUS |
among-the-very-worst | = STAR 2*9MINUS |
; | |
;; ~~ importance ~~ | |
highest-importance | = 3*10BANG |
very-important | = 2BANG |
important | = BANG |
unimportant | = BANG MINUS |
irrelevant | = BANG 2*9MINUS |
; | |
;; ~~ outlook ~~ | |
never-more-optimistic | = 3*10CURRENCY |
very-optimistic | = 2CURRENCY |
optimistic | = CURRENCY |
pessimistic | = CURRENCY MINUS |
very-pessimistic | = CURRENCY 2MINUS |
never-more-pessimistic | = CURRENCY 3*9MINUS |
CURRENCY | = DOLLAR / EURO / POUND / YUAN-YEN |
; | |
;; ~~ support ~~ | |
strongly-support | = 3*10PLUS |
support | = 2PLUS |
somewhat-support | = PLUS |
somewhat-oppose | = MINUS |
oppose | = 2MINUS |
strongly-oppose | = 3*10MINUS |
; | |
;; ~~ likelihood ~~ | |
definitely | = 3*10PERCENT |
very-likely | = 2PERCENT |
likely | = PERCENT |
unlikely | = PERCENT MINUS |
very-unlikely | = PERCENT 2MINUS |
definitely-not | = PERCENT 3*9MINUS |
; | |
;; ISO code sets | |
iso-5218-gender | = male / female |
iso-3166-country | = 2LETTER |
iso-3166-subdivision | = 1*2DIGIT / 2*3LETTER |
male | = M |
female | = F |
; | |
;; symbols | |
BANG | = %x21 |
HASH | = %x23 |
DOLLAR | = %x24 |
PERCENT | = %x25 |
AMPERSAND | = %x26 |
APOSTROPHE | = %x27 |
STAR | = %x2A |
PLUS | = %x2B |
MINUS | = %x2D |
HYPHEN | = %x2D |
PERIOD | = %x2E |
SLASH | = %x2F |
QUESTION | = %x3F |
UNDERSCORE | = %x5F |
EURO | = %x80 |
POUND | = %xA3 |
YUAN-YEN | = %xA5 |
M | = "M" / "m" |
F | = "F" / "f" |
T | = "T" / "t" |
H | = "H" / "h" |
I | = "I" / "i" |
S | = "S" / "s" |
; | |
;; symbol groups | |
LETTERDIGIT | = LETTER / DIGIT |
LETTER | = "A" / "B" / "C" / "D" / "E" / "F" / "G" |
LETTER | =/ "H" / "I" / "J" / "K" / "L" / "M" / "N" |
LETTER | =/ "O" / "P" / "Q" / "R" / "S" / "T" / "U" |
LETTER | =/ "V" / "W" / "X" / "Y" / "Z" |
LETTER | =/ "a" / "b" / "c" / "d" / "e" / "f" / "g" |
LETTER | =/ "h" / "i" / "j" / "k" / "l" / "m" / "n" |
LETTER | =/ "o" / "p" / "q" / "r" / "s" / "t" / "u" |
LETTER | =/ "v" / "w" / "x" / "y" / "z" |
ONETONINE | = "1" / "2" / "3" / "4" / "5" / "6" / "7" / "8" / "9" |
ZEROTOFIVE | = "0" / "1" / "2" / "3" / "4" / "5" |
UPPER-ASCII | = %xC0-FF |
UNICODE | = PLANE0 |
PLANE0 | = %x0100-D7FF / %xE000-FDCF |
PLANE0 | =/ %xFDF0-FFFD |
;; NOTE: java/scala lack support for PLANE1-2 | |
;; PLANE1 | = %x10000-1FFFD |
;; PLANE2 | = %x20000-2FFFD |
A UVML ballot consists of one or more UVML votes (rating or ranking) and, optionally, a UVML voter demographic signature. Ballots can coexist with standard prose in any language.
A UVML rating vote begins with a hash (#), followed immediately by a target and score.
A UVML ranking vote begins with a double-hash (##), followed immediately by a contest name and a list of selections in order of preference: #first #second ... etc., up to 25 selections. A selection may be a simple hash tag or a UVML rating vote (#first+++).
A UVML demographic signature begins with a triple-hash (###), followed by age, gender, and location (ideally, voting jurisdiction) of the voter.
Ex. Lorem ipsum dolor sit #topic*** amet, consectetur adipiscing elit. Nulla sed blandit felis. #topic++ Ut lacinia nunc ##contest1 #candidateA #candidateB #candidateC venenatis sapien #topic%-- adipiscing ##contest2 #candidateD+++ #candidateE+ ultrices. Nulla dictum tempus #topic$$$ volutpat. ###35F-USNY
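The hash-prefix convention above can be illustrated with a minimal Python sketch. It is not part of the specification: the function name `scan` and the whitespace-token approach are assumptions made for illustration. The sketch does not validate names or scores against the ABNF, does not handle rank prefixes or trailing punctuation, and assumes that a run of selections ends at the first token that does not begin with a hash.

```python
def scan(ballot):
    """Label each hash-prefixed token in a ballot (hypothetical helper).

    Returns (kind, body) pairs in order of appearance, where kind is
    "rating", "ranking", "selection", or "signature".
    """
    elements = []
    in_contest = False  # True while reading the selections of a ranking vote
    for token in ballot.split():
        if token.startswith("###"):
            elements.append(("signature", token[3:]))
            in_contest = False
        elif token.startswith("##"):
            elements.append(("ranking", token[2:]))
            in_contest = True
        elif token.startswith("#"):
            kind = "selection" if in_contest else "rating"
            elements.append((kind, token[1:]))
        else:
            # Assumption: ordinary prose ends the current list of selections.
            in_contest = False
    return elements
```

Applied to a ballot such as the example above, free text is skipped and each vote or signature is reported with its prefix removed.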
UVML supports two types of votes: rating and ranking. Rating votes represent a voter's valuation of a specific target. Ranking votes represent a voter's preferences among a number of options.
Ex (Rating). #fluger**** ;; fluger quality is 'very good'
Ex (Ranking). ##president.us #Smith #Jones #Johnson ;; voter preferences for president of the United States are Smith, then Jones, then Johnson.
Ex (Ranking). Ranking votes may also include rating votes, as in: ##president.us #Smith+++ #Jones+ #Johnson+ ;; preferences for president of the United States are Smith (strongly support), then Jones (somewhat support), then Johnson (somewhat support).
A UVML rating vote consists of a hash (#) followed immediately by a target and score. Optionally, a voter may qualify a rating as provisional, or subject to change, by indicating that he or she is undecided.
A UVML target is the object of the voter sentiment, optionally suffixed by a taxonomy of subgroups from largest to smallest separated by a period (.). UVML also makes provision for a local reference, using the special target '#this'.
Ex. #target***
Ex. #drivers.subgroup.subsubgroup*****
Ex. #this!!! ;; a comment on the importance of whatever the comment is attached to
The score on a UVML rating vote consists of an optional undecided indicator followed by a valuation.
A voter may indicate that he or she is undecided by placing a question-mark (?) before the valuation. The connotation is that the voter is not completely convinced of his or her opinion, but the recorded opinion is the voter's current inclination.
Ex. #propXYZ?++ ;; unsure, but tending to support propXYZ
Ex. #target?$ ;; unsure, but tending to be optimistic
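As an illustration of the rating structure (target, optional undecided indicator, valuation), the following Python sketch splits a single rating token into its parts. The names `parse_rating`, `TAG`, `NAME`, and `VALUATION` are hypothetical. The character ranges follow the `tag` rule, but the valuation pattern is deliberately simplified: it accepts the same strings as the ABNF without distinguishing levels.

```python
import re

# Character ranges taken from the name-begin rule (LETTER, UPPER-ASCII, PLANE0).
NAME_CHARS = r"A-Za-z\xc0-\ud7ff\ue000-\ufdcf\ufdf0-\ufffd"
TAG = rf"[{NAME_CHARS}][0-9{NAME_CHARS}]{{0,70}}[0-9{NAME_CHARS}]"
NAME = rf"{TAG}(?:\.{TAG})*"

# Simplified valuation: one symbol family per alternative, levels not separated.
VALUATION = (
    r"\*{1,10}|\*-{1,9}|!{1,10}|!-{1,9}"
    r"|[$\x80\xa3\xa5]{1,10}|[$\x80\xa3\xa5]-{1,9}"
    r"|\+{1,10}|-{1,10}|%{1,10}|%-{1,9}"
)

RATING = re.compile(
    rf"#(?P<target>{NAME})(?P<undecided>\?)?(?P<valuation>{VALUATION})"
)

def parse_rating(token):
    """Return the parts of one rating vote, or None if the token is not one."""
    match = RATING.fullmatch(token)
    if match is None:
        return None
    return {
        "target": match["target"],
        "undecided": match["undecided"] is not None,
        "valuation": match["valuation"],
    }
```

Under these assumptions, `parse_rating("#propXYZ?++")` reports the target `propXYZ`, an undecided voter, and the valuation `++`. The special target `this` needs no separate branch here because it already satisfies the `name` rule.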
UVML supports five of the most commonly used opinion valuation dimensions ([Phillips2011]): quality, importance, outlook, support-opposition, and likelihood.
Quality: How good or bad someone thinks something is.
UVML defines quality as the level of "virtue" or "positive effects produced" or "positive connotation of".
Quality | Syntax |
---|---|
Among the Very Best | ***** |
Very Good | **** |
Good | *** |
Fair | ** |
Bad (or Poor) | * |
Very Bad (or Very Poor) | *- |
Among the Very Worst | *-- |
Importance: How significant or insignificant someone thinks something is compared to other items to be considered.
Importance | Syntax |
---|---|
Highest Importance | !!! |
Very Important | !! |
Important | ! |
Unimportant | !- |
Irrelevant | !-- |
Outlook: How expectant of good (optimistic) or bad (pessimistic) someone is about something.
Outlook | Syntax |
---|---|
Never More Optimistic | $$$ |
Very Optimistic | $$ |
Optimistic | $ |
Pessimistic | $- |
Very Pessimistic | $-- |
Never More Pessimistic | $--- |
Support-Opposition: How strongly someone's thinking is aligned with a topic.
Support | Syntax |
---|---|
Strongly Agree or Strongly Support | +++ |
Agree or Support | ++ |
Agree, somewhat | + |
Disagree, somewhat | - |
Disagree or Oppose | -- |
Strongly Disagree or Strongly Oppose | --- |
Likelihood: How certain someone is that something is going to happen.
Likelihood | Syntax |
---|---|
Definitely (100%) | %%% |
Very Likely (>90%) | %% |
Likely (>75%) | % |
Unlikely (<25%) | %- |
Very Unlikely (<10%) | %-- |
Definitely Not (0%) | %--- |
UVML gives voters the freedom to emphasize the strength of an opinion by repeating the final symbol of the valuation. After all, three exclamation points are just not enough for some #topics!!!!!!
From a tabulation standpoint, emphasized opinions carry no more weight than the maximum valuation of their dimension.
Ex. #life********* ;; life has never been better
Ex. #life$$$$$$ ;; never been more optimistic
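One way to see how emphasis collapses to the maximum level is to map each valuation string to its dimension and ABNF rule name. The Python sketch below does this with one pattern per rule; the names `RULES` and `classify` are assumptions made for illustration, and the table simply restates the repetition counts given in the ABNF.

```python
import re

CUR = r"[$\x80\xa3\xa5]"  # CURRENCY rule: DOLLAR / EURO / POUND / YUAN-YEN

# (pattern, dimension, ABNF rule name), one entry per valuation rule.
RULES = [
    (r"\*{5,10}", "quality", "among-the-very-best"),
    (r"\*{4}", "quality", "very-good"),
    (r"\*{3}", "quality", "good"),
    (r"\*{2}", "quality", "fair"),
    (r"\*", "quality", "poor"),
    (r"\*-", "quality", "very-poor"),
    (r"\*-{2,9}", "quality", "among-the-very-worst"),
    (r"!{3,10}", "importance", "highest-importance"),
    (r"!{2}", "importance", "very-important"),
    (r"!", "importance", "important"),
    (r"!-", "importance", "unimportant"),
    (r"!-{2,9}", "importance", "irrelevant"),
    (rf"{CUR}{{3,10}}", "outlook", "never-more-optimistic"),
    (rf"{CUR}{{2}}", "outlook", "very-optimistic"),
    (rf"{CUR}", "outlook", "optimistic"),
    (rf"{CUR}-", "outlook", "pessimistic"),
    (rf"{CUR}-{{2}}", "outlook", "very-pessimistic"),
    (rf"{CUR}-{{3,9}}", "outlook", "never-more-pessimistic"),
    (r"\+{3,10}", "support-opposition", "strongly-support"),
    (r"\+{2}", "support-opposition", "support"),
    (r"\+", "support-opposition", "somewhat-support"),
    (r"-", "support-opposition", "somewhat-oppose"),
    (r"-{2}", "support-opposition", "oppose"),
    (r"-{3,10}", "support-opposition", "strongly-oppose"),
    (r"%{3,10}", "likelihood", "definitely"),
    (r"%{2}", "likelihood", "very-likely"),
    (r"%", "likelihood", "likely"),
    (r"%-", "likelihood", "unlikely"),
    (r"%-{2}", "likelihood", "very-unlikely"),
    (r"%-{3,9}", "likelihood", "definitely-not"),
]

def classify(valuation):
    """Return (dimension, rule name) for a valuation string, or None."""
    for pattern, dimension, level in RULES:
        if re.fullmatch(pattern, valuation):
            return dimension, level
    return None
```

Because the top rule of each dimension covers a range of repetitions, nine stars and five stars both classify as `among-the-very-best`, so emphasis changes how a vote reads without changing how it is counted.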
A ranking consists of a contest, and one or more selections.
A UVML contest vote is indicated by a double-hash (##), followed by the contest name. Each selection takes the form of a hash tag, optionally followed by a valuation. Voter selections are listed in order, beginning with the voter's first selection, optionally followed by second, then third, etc., up to 25 selections.
Ex. ##president.us #Smith #Jones #Johnson
Ex. ##president.us #Smith+++ #Jones++ #Johnson
From a contest tabulation standpoint, both of the above examples are tabulated the same. Each candidate receives a score of 1/CHOICE for the ballot, where CHOICE is that candidate's ordinal position: in this case Smith +1.0, Jones +0.5, and Johnson +0.33.
The second example shows that UVML rating votes can be used as UVML contest selections. In such cases, the rating votes are tabulated as well: in this case Smith (strongly support) and Jones (support).
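The 1/CHOICE scoring can be sketched in Python as follows. The helper names `selections` and `tabulate` are hypothetical, and the parsing is simplified: it assumes selections are separated by whitespace, takes ordinal position from listing order, and ignores explicit rank prefixes. Tabulating the attached rating votes is outside the scope of the sketch.

```python
from collections import defaultdict

def selections(ranking):
    """Split one ranking vote into (contest, [names in listed order])."""
    contest, *tokens = ranking.split()
    # Drop the leading hash and any trailing score symbols from each selection.
    names = [t[1:].rstrip("?*!$+%-") for t in tokens if t.startswith("#")]
    return contest[2:], names

def tabulate(rankings):
    """Sum 1/position scores per contest and candidate over ranking votes."""
    totals = defaultdict(lambda: defaultdict(float))
    for ranking in rankings:
        contest, names = selections(ranking)
        for position, name in enumerate(names, start=1):
            totals[contest][name] += 1.0 / position
    return {contest: dict(scores) for contest, scores in totals.items()}
```

With this sketch, `##president.us #Smith #Jones #Johnson` and `##president.us #Smith+++ #Jones++ #Johnson` produce identical contest scores, since the valuations are stripped before positions are scored.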
UVML enables voters to provide a demographic signature, which helps polling stations segment voters. A UVML signature is prefixed by a triple-hash (###). The voter signature may appear before, after, or between votes within a ballot.
A UVML demographic signature is composed of three parts: age, gender, and location. Only gender is required. The location may be a country, country-state, or country-areacode. The country is optional, but recommended in cases where the polling station may not reliably supply a correct default.
Ex. ###F ;; female voter
Ex. ###M ;; male voter
Ex. ###29F212 ;; 29 year old female voter from New York City
Ex. ###M-CN ;; male voter from China
Ex. ###75F-USCA ;; 75 year old female voter from California
Votes are tabulated at each level where the information is provided or can be derived. In the case of ###M212, votes would be counted for the 212 area code, the state of New York, and the country of the United States.
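The signature layout can be illustrated with a short Python sketch that splits a signature into its fields. The function name `parse_signature` and the field names are assumptions; the pattern mirrors the `profile` rule with named groups added. Deriving a state or country from an area code would need a lookup table that is not part of this sketch.

```python
import re

SIGNATURE = re.compile(
    r"###(?:(?P<age>\d{1,3})-?)?(?P<gender>[FMfm])"
    r"(?:-?(?P<country>[A-Za-z]{2}))?"
    r"(?:-?(?P<region>\d{1,2}|[A-Za-z]{2,3}))?"
    r"(?:-?(?P<area_code>\d{3,4}))?"
)

def parse_signature(token):
    """Return the fields of one demographic signature, or None."""
    match = SIGNATURE.fullmatch(token)
    if match is None:
        return None
    fields = match.groupdict()  # absent optional parts are reported as None
    if fields["age"] is not None:
        fields["age"] = int(fields["age"])
    fields["gender"] = fields["gender"].upper()
    return fields
```

Note that a two-letter code directly after the gender is read as a country, so in this sketch `###F-CA` denotes Canada, while California is written with its country as in `###75F-USCA`.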
The following standards promote consistency in voter experience across polling stations.
For any UVML topic, contest, or selection which includes a subgroup delimiter (.), the polling station must not roll up a specific topic to a more general one. For example, #drivers.ny**** must not imply #drivers****.
If multiple votes for the same topic and valuation type are included in a single ballot, then a polling station must count the first vote only.
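A minimal Python sketch of this first-vote-only rule is shown below. It assumes the ballot has already been parsed into (target, dimension, level) tuples in order of appearance; the function name `first_votes` is hypothetical, and targets are compared as exact strings because this document does not define case folding.

```python
def first_votes(votes):
    """Keep only the first vote for each (target, dimension) pair."""
    seen = set()
    kept = []
    for target, dimension, level in votes:
        key = (target, dimension)
        if key in seen:
            continue  # a later vote on the same topic and valuation type
        seen.add(key)
        kept.append((target, dimension, level))
    return kept
```

Votes on the same topic in different dimensions, such as a quality rating and a support rating, are both kept.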
Signatures may appear anywhere within a ballot, though the ABNF grammar for a ballot indicates that a signature follows after all votes.
If multiple demographic signatures are present in a single ballot, then a polling station must reference the first signature only.
Ballots which are obviously forwarded should not be counted.
UVML in video is treated as UVML in text. Whether on one frame or all frames, multiple UVML votes for the same topic and opinion type are not counted more than once.
A UVML vote can take two rhetorical forms: confirming and inline.
In the confirming form, UVML restates, clarifies, or quantifies an opinion expressed elsewhere in the text.
Ex. Came across this cool voting syntax called UVML. #uvml*****.
In the inline form, UVML both embodies the full semantics of opinion and plays a grammatical role in the text.
Ex. Check out #uvml*****.
This section contains some more theoretical descriptions of the core motivations of the various components of UVML. Future changes to the UVML standard should be supported by these functional specifications.
Author's Note: While an effort has been made to inform UVML's design goals with concepts from elections systems theory, the field is highly complex and the treatment of related topics in this document should only be considered superficial.
The aim of UVML is to enable a high quality ubiquitous voting system capable of processing ballots authored in text, image, or video. The functional requirements for UVML are presented below. The framework for the below requirements benefits from [HospVora2010].
UVML must enable voters to express their opinions completely, clearly, and compactly, such that the voter can be confident that a vote cast using UVML accurately reflects his or her intentions.
UVML must enable voters to express their opinions using the principal elements of opinion semantics (Section 8.2) and rank-order preference semantics (Section 8.3).
UVML must enable voters to clearly communicate the semantics of their opinions (Completeness) to both human and automated-agent readers.
The syntax for UVML votes and signatures must be defined using an ABNF grammar.
The syntax for UVML votes and signatures should be defined so as to minimize conflicts with established modes of textual or visual communications.
UVML syntax must include only the minimum number of symbols required to meet semantic requirements. UVML must reconcile conflicts between compactness and clarity in favor of compactness.
UVML syntax must be a universal syntax, requiring no or only minimal localization.
UVML must enable voters to sign their ballots with a demographic signature; however, "polling stations" must not require a signature for a ballot to be counted.
UVML must provide inference-free voting tabulation, such that the vote cast is the vote recorded.
UVML must enable voters to query voting results and sample ballots.
UVML syntax must be easy enough to process such that polling stations can render tabulated returns in near-real-time, so that voters can be assured that their votes and the votes of others are being reported as they are counted.
UVML must enable a voter to vote again if a previous vote was cast incorrectly or the voter's opinion changes.
UVML must enable a closed-loop voting system, where voters can query the full content of the ballots of other voters and re-vote as they learn more about the things that are important to themselves and other voters.
This section includes remarks which didn't seem to fit anywhere else.
Some work was done in the design of UVML to limit accidental voting. Accidental votes are votes which are valid UVML, but for which the author did not intend to vote using UVML---his or her text just happened to be valid UVML.
This author examined a blog corpus ([Spinn3r2009]) totaling 62M social media documents in . The accidental vote rate for unique targets was 33:1,000,000. The rate of accidental voting for meaningful targets is much lower, perhaps 3-5:1,000,000. Below are some examples of the false positives. Unicode characters in the below examples will appear as � in the ascii format of this specification.
Ex(s). #BWE08!, #ta!!!!!, #arriba-, #cbaaab+, ##BR#Showing, #top-, #user-, #Movies336-, #Caution!, #路况市区20%, #DSNYUK23789BNew!!, #er!, #YUWB7JLUQI%, #Ap+, #cutid1!, #leella!, ##BR#Great, #︿$*, #FNT*, #Увольте!, #g4+, #¥……—*, #FF6021!, #trendanalysis-, #输出heapdump*, #KSCS25FVSS-, #уй!, #EL+, #tch!!!, #fwe%, #seoad-, #outros240-, #GOBAMA!, #NR%, #данутых!, #osleoperd-, #而对于c++, #DANUNO-, ##BR#Only, #vouvotar!, ##BR#Jacquelyn.
The language distribution of these accidental votes was more or less consistent with each language's representation in the corpus. Portuguese was a little higher than expected for some reason, and Chinese/Japanese/Korean accidental voting rates were a little lower than expected: U -> 382, my -> 1, de -> 89, fr -> 182, pl -> 6, hr -> 9, uk -> 7, ta -> 4, hi -> 2, io -> 11, ka -> 3, ru -> 178, ko -> 3, th -> 6, cjk -> 489, ar -> 18, no -> 9, nl -> 17, et -> 7, pt -> 317, ro -> 4, hu -> 1, ja -> 122, vi -> 8, da -> 3, sv -> 19, he -> 21, fa -> 4, it -> 20, zh -> 31, es -> 123, en -> 3006.
UVML's current syntax seems to strike a useful balance between compactness, differentiation from conventional text, and expressive power.
UVML leverages the definition of opinion semantics provided by FrameNet: a "topic", an "aspect", a "constancy" value, and a "manner."
For Election-Contest semantics, UVML uses the definition of rank-order semantics provided by FrameNet. Specifically, "the {perceived} state of occupying a certain {place} within a hierarchy."
At this time, the specification of UVML does not include a hash-right convention for right-to-left languages. To date, social media users of right-to-left languages have not established such a convention. If this changes, the UVML specification may follow.
This document was constructed using the xml2rfc project.
This memo includes no request to IANA.
UVML does not pose a security threat.
[RFC2119] | Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997. |
[RFC5234] | Crocker, D. and P. Overell, "Augmented BNF for Syntax Specifications: ABNF", STD 68, RFC 5234, January 2008. |
[HospVora2010] | Department of Computer Science, George Washington University, "An information-theoretic model of voting systems", October 2008. |
[Phillips2011] | "Multi-language Analysis of Sentiment Types in Blogs", May 2011. |
[Spinn3r2009] | Proceedings of the Third Annual Conference on Weblogs and Social Media (ICWSM 2009), "The ICWSM 2009 Spinn3r Dataset", May 2009. |
The following basic regular expressions (written for Scala) express the above ABNF specification.
Basic regular expression for the capture of UVML rating votes.
(?:\x23(?:[A-Za-z\xc0-\ud7ff\ue000-\ufdcf\ufdf0-\ufffd][0-9A-Za-z\xc0-\ud7ff\ue000-\ufdcf\ufdf0-\ufffd]{0,70}[0-9A-Za-z\xc0-\ud7ff\ue000-\ufdcf\ufdf0-\ufffd](?:\x2e[A-Za-z\xc0-\ud7ff\ue000-\ufdcf\ufdf0-\ufffd][0-9A-Za-z\xc0-\ud7ff\ue000-\ufdcf\ufdf0-\ufffd]{0,70}[0-9A-Za-z\xc0-\ud7ff\ue000-\ufdcf\ufdf0-\ufffd])*|[Tt][Hh][Ii][Ss])\x3f?(?:\x2a{5,10}|\x2a{4}|\x2a{3}|\x2a{2}|\x2a|\x2a\x2d|\x2a\x2d{2,9}|!{3,10}|!{2}|!|!\x2d|!\x2d{2,9}|[\x24\x80\xa3\xa5]{3,10}|[\x24\x80\xa3\xa5]{2}|[\x24\x80\xa3\xa5]|[\x24\x80\xa3\xa5]\x2d|[\x24\x80\xa3\xa5]\x2d{2}|[\x24\x80\xa3\xa5]\x2d{3,9}|\x2b{3,10}|\x2b{2}|[\x2b\x2d]|\x2d{2}|\x2d{3,10}|%{3,10}|%{2}|%|%\x2d|%\x2d{2}|%\x2d{3,9}))(?=\Z|[.;:\s])
Basic regular expression for the capture of UVML ranking votes.
(?:\x23{2}[A-Za-z\xc0-\ud7ff\ue000-\ufdcf\ufdf0-\ufffd][0-9A-Za-z\xc0-\ud7ff\ue000-\ufdcf\ufdf0-\ufffd]{0,70}[0-9A-Za-z\xc0-\ud7ff\ue000-\ufdcf\ufdf0-\ufffd](?:\x2e[A-Za-z\xc0-\ud7ff\ue000-\ufdcf\ufdf0-\ufffd][0-9A-Za-z\xc0-\ud7ff\ue000-\ufdcf\ufdf0-\ufffd]{0,70}[0-9A-Za-z\xc0-\ud7ff\ue000-\ufdcf\ufdf0-\ufffd])*(?:[\t\x20]*(?:\d|1\d|2[0-5])?\x23[A-Za-z\xc0-\ud7ff\ue000-\ufdcf\ufdf0-\ufffd][0-9A-Za-z\xc0-\ud7ff\ue000-\ufdcf\ufdf0-\ufffd]{0,70}[0-9A-Za-z\xc0-\ud7ff\ue000-\ufdcf\ufdf0-\ufffd](?:\x2e[A-Za-z\xc0-\ud7ff\ue000-\ufdcf\ufdf0-\ufffd][0-9A-Za-z\xc0-\ud7ff\ue000-\ufdcf\ufdf0-\ufffd]{0,70}[0-9A-Za-z\xc0-\ud7ff\ue000-\ufdcf\ufdf0-\ufffd])*(?:\x3f?(?:\x2a{5,10}|\x2a{4}|\x2a{3}|\x2a{2}|\x2a|\x2a\x2d|\x2a\x2d{2,9}|!{3,10}|!{2}|!|!\x2d|!\x2d{2,9}|[\x24\x80\xa3\xa5]{3,10}|[\x24\x80\xa3\xa5]{2}|[\x24\x80\xa3\xa5]|[\x24\x80\xa3\xa5]\x2d|[\x24\x80\xa3\xa5]\x2d{2}|[\x24\x80\xa3\xa5]\x2d{3,9}|\x2b{3,10}|\x2b{2}|[\x2b\x2d]|\x2d{2}|\x2d{3,10}|%{3,10}|%{2}|%|%\x2d|%\x2d{2}|%\x2d{3,9}))?){1,25})(?=\Z|[.;:\s])
Basic regular expression for the capture of UVML signatures.
(?:\x23{3}(?:\d{1,3}\x2d?)?[FMfm](?:(?:\x2d?[A-Za-z]{2})?(?:\x2d?(?:\d{1,2}|[A-Za-z]{2,3}))?(?:\x2d?\d{3,4})?)?)(?=\Z|[.;:\s])