Tagging Prototype

An initial entry for the "Tagging Prototype" wiki page, from Tim Rue (I don't claim to be knowledgeable about wiki editing).


Introduction

"Tagging" as dictionary defined: To label, identify, or recognize with or as if with a tag.

"Tag" in Computer Science:
a) A label assigned to identify data in memory.
b) A sequence of characters in a markup language used to provide information, such as formatting, specifications, about a document.

Tags and tagging are meaningless by themselves, but become useful when associated with some meaning. In prior art tagging, tags help to identify,
organize and search prior art related information. The value of tags is in what they tag and how we use what they tag, from concepts to documentation to code.


The challenge of Tagging:

Due to the versatile abstract nature of software and the abstract concept of tagging, or more
directly, the ability of the human mind to create such layers of abstraction, any "tagging"
rule-sets we create, we can break. Breaking such a rule set does not disqualify the prior art
that breaks it, for that would create a false limitation on creativity and innovation for
the sake of tagging. Tagging is at best secondary to the art it is tagging.

With this in mind, it is the reason for tagging that is important: the objective,
regarding the USPTO, is simply to make the artwork reasonably "findable" when searching for it.
Artwork that is not accessible is the exception to anything we can come up with
in this prior art "tagging" effort.

"Findable" artwork can vary in how it is accessible. Addressing whether or not the artwork can
be directly altered to add tagging (altering its date/time stamp), only pointed to (leaving the
artwork's date/time stamps untouched), or kept as a dual version (a tagged version pointing to the original) requires
translations or post art processing to reach the prior art search objective. Such post art
processing can itself vary as much as... well, as much as we can. Several post processing
applications for such artwork have already been mentioned on the mailing list, along with internet
search engines; there are even more not mentioned, including numerous long-existing text processing
applications such as grep and awk.
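
As one illustration of such post processing, a grep-like scan can be sketched in a few lines of Python. The "@tag" marker and the sample text are assumptions made up for this example; nothing here is taken from an existing tool or from the mailing list.

```python
import re


def grep_lines(pattern, text):
    """Return (line_number, line) pairs for every line matching `pattern`."""
    rx = re.compile(pattern)
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if rx.search(line)
    ]


# Hypothetical document in which tags sit on lines that start with "@tag".
SAMPLE_TEXT = "intro\n@tag parser\nbody mentions @tag inline\n"
TAG_LINES = grep_lines(r"^@tag\b", SAMPLE_TEXT)
```

The artwork is only read, never modified, so this kind of processing fits the "only pointed to" case as well as the others.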

To add to the challenge, there is the "where" to look. Besides the patent databases and the
open software repositories, there are other sources, on and off the internet. A lot of art on the
internet can change, drop from a site (but perhaps be backed up), change URL locations, be duplicated or
mirrored, be temporarily unavailable, etc.

Then there is the USPTO classification system to take into account, and this classification
system can be changed by the USPTO itself.

If there is any constant in all this, it is "change" and the potential for breaking rules.
Success depends on the search and translation or post processing machinery much more than on
expecting programmers to tag their work, old and new, to accommodate any patent office's prior art
search needs. The lack of application/code documentation is enough of a problem without adding
tagging to the coder's list of "things to do".

This puts the "sum" of "prior art search" database maintenance on the shoulders of the patent
offices. The best we might do to help them do their job is to present our prior art in a form
that can be useful to them. Should there be a change in what we present or how they classify it,
then it is on them to maintain a record of what was previously presented/accessed and how it was
classified (including prior art that was not patented). It is a simple matter of their own ability to
determine the certainty of date/time stamps from their own maintained, and hopefully secured, in-house
resources. If they have never accessed a resource we have presented them with, then that resource
would not be expected to appear in their database. And for those
resources they have accessed, how up to date their record is would minimally depend on their access
history of that resource (if they have accessed it once, then the part they have accessed
should be internalized).

In other words: it should be through the process of searching outside resources that internal databases
are at least created and updated, in a dated, journalized manner. This is not to say that some sort of
automation over external resources, for internal database maintenance and updating, is not applied.
The patent examiner search process should, of course, be internal first (where it should be faster),
then external, while avoiding duplicate searching of what the internal search already covered.
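
The "internal first, then external, without duplication" idea can be sketched as follows. This is only an illustration under simplifying assumptions: the `Journal` class, the `examiner_search` function and the dictionary of external resources are all hypothetical, and a resource is treated as unchanged once it has been internalized.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Journal:
    """Append-only, dated record of every outside resource accessed so far."""
    entries: list = field(default_factory=list)

    def record(self, resource, content, when=None):
        when = when or datetime.now(timezone.utc)
        self.entries.append((when, resource, content))

    def known_resources(self):
        return {resource for _, resource, _ in self.entries}

    def search(self, term):
        return [(when, res) for when, res, content in self.entries if term in content]


def examiner_search(term, journal, external):
    """Search the internal journal first, then only external resources not yet internalized."""
    hits = journal.search(term)
    for resource, content in external.items():
        if resource in journal.known_resources():
            continue  # already covered by the internal search
        journal.record(resource, content)  # internalize on first access, with a date
        if term in content:
            hits.append((journal.entries[-1][0], resource))
    return hits
```

A second search over the same external resources touches only the journal, which is the point of internalizing what has been accessed.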


Date/time stamping

Date/time stamping is another subject matter, related to tagging only in that the act of tagging
should not alter prior art date/time stamps.
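
One way to tag without touching a date/time stamp is to keep the tags in a separate file that points at the artwork. The sketch below assumes a hypothetical ".tags.json" sidecar naming convention; it is not something the project prescribes.

```python
import json
import os
import tempfile


def tag_without_touching(path, tags):
    """Write tags to a sidecar file; the artwork at `path` is never opened for writing."""
    sidecar = path + ".tags.json"  # hypothetical naming convention
    with open(sidecar, "w", encoding="utf-8") as fh:
        json.dump({"target": os.path.basename(path), "tags": sorted(tags)}, fh)
    return sidecar


def demo():
    """Tag a throwaway file and report its modification time before and after."""
    with tempfile.TemporaryDirectory() as folder:
        art = os.path.join(folder, "art.txt")
        with open(art, "w", encoding="utf-8") as fh:
            fh.write("some prior art")
        os.utime(art, (946684800, 946684800))  # pretend the art dates from 2000-01-01 UTC
        before = os.stat(art).st_mtime
        sidecar = tag_without_touching(art, {"parser", "arexx"})
        with open(sidecar, encoding="utf-8") as fh:
            tags = json.load(fh)["tags"]
        return before, os.stat(art).st_mtime, tags
```

The sidecar gets its own, newer date/time stamp, which records when the tagging happened rather than when the art was made.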

Above are the challenges (or some of them) of developing a tagging system. Creating a tagging
prototype may seem overwhelming to some. But if tagging is seen in terms of prior art placement(s) or
positioning(s) in a dual mapping and navigation of this map, or "navigational mapping", of prior art
databases, then it becomes much easier to comprehend a solution direction.

Prior-Art batch >to> specified search mechanism w/optional tagging translations >to> prior art map(s).

These maps would then get integrated into a master map held secure in-house at the USPTO, either open to
no one outside or to everyone outside, but nothing in between, as a matter of fairness. Note: this does not prevent
any outside parties from developing their own "master maps."
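
The flow from a prior art batch to a map, and from a map into a master map, might look roughly like this. The sketch reduces the "search mechanism" to a ready-made list of tags per resource, and the names `build_map`, `merge_into_master` and the translation table are hypothetical.

```python
def build_map(batch, translate=None):
    """Map each (optionally translated) tag to the set of resources carrying it.

    batch: {resource_name: [tag, ...]}
    translate: optional {found_tag: preferred_tag} table
    """
    result = {}
    for resource, tags in batch.items():
        for tag in tags:
            if translate:
                tag = translate.get(tag, tag)  # untranslated tags pass through
            result.setdefault(tag, set()).add(resource)
    return result


def merge_into_master(master, new_map):
    """Fold a newly built map into the master map, keeping what was already there."""
    for tag, resources in new_map.items():
        master.setdefault(tag, set()).update(resources)
    return master
```

Because merging only ever adds entries, an outside party could build its own master map from the same batches in the same way.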


USPTO

In 1999 the USPTO issued an RFC: "Notice of Public Hearing and Request for Comments on Issues Related to the Identification of Prior Art During the Examination of a Patent Application".

One of the responses was given for two reasons: first, that I saw it as a genuine solution direction,
and second, that I knew the USPTO themselves would publish it, as another facet supporting the project NOT
being patentable.

My perception of it being a solution direction has not changed, though the objective of this software
prior art effort and "tagging prototype" does seem to present me with the next stepping stone in moving
the project forward, perhaps into some use, but at least minimally to show I have not abandoned it.


Prototype

The project was briefly introduced (with working examples) on the prior art mailing list: link

Requires Python.

Get this file and make it accessible to Python:
http://threeseas.net/vic/IQ-ID/iq.py

While online, enter at your command prompt/shell:

python iq.py -k http://threeseas.net/vic/IQ-ID/knmvic.iq . .

Note: the ". ." are wildcards.

The result is a list of keys.

Now try:

python iq.py http://threeseas.net/vic/IQ-ID/knmvic.iq The .

What it outputs is the contents of all ":" words that start with
"The" (":The" in the file). The "." is a wildcard matching
all sub-definitions of each word found.
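
The matching described here can be pictured with a short sketch. It assumes that "." matches everything and that any other argument matches by prefix, for sub-definitions as well as for words; the nested dictionary and the function names are hypothetical and say nothing about how iq.py is actually written.

```python
def matches(pattern, name):
    """'.' matches anything; any other pattern matches names that start with it."""
    return pattern == "." or name.startswith(pattern)


def select(entries, word_pattern, sub_pattern):
    """Pick matching words, and within each word the matching sub-definitions.

    entries: {word: {sub_definition_name: text}}
    """
    return {
        word: {sub: text for sub, text in subs.items() if matches(sub_pattern, sub)}
        for word, subs in entries.items()
        if matches(word_pattern, word)
    }
```

With this model, the arguments "The ." select every word starting with "The" together with all of its sub-definitions.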

The "filekey : :: :::" that you see is like a legend for a map or blueprint.

The ": :: :::" part can be any character sequence, as we all know it's important
to address exceptions when dealing with
matters of mental fabrication, where breaking rules is commonplace.

The positions can be seen as: 

: = word and general definition

:: = sub definition parts

::: = also see reference.
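
To make the three positions concrete, here is a minimal sketch of a reader for a file laid out this way. The line layout it accepts (a "filekey" line followed by lines that begin with one of the three markers) is an assumption for illustration; it is not the actual .iq format and not the iq.py parser.

```python
def parse_tagged(text):
    """Group lines under words, using the three markers named on the filekey line."""
    entries, current, section, markers = {}, None, None, None
    for line in text.splitlines():
        if line.startswith("filekey "):
            word_m, sub_m, see_m = line.split()[1:4]
            # Longest marker first, so ":::" is not mistaken for ":".
            markers = sorted(
                [(word_m, "word"), (sub_m, "sub"), (see_m, "see")],
                key=lambda pair: len(pair[0]),
                reverse=True,
            )
            continue
        if markers is None or not line.strip():
            continue  # nothing can be classified before a filekey is set
        kind = next((k for m, k in markers if line.startswith(m)), None)
        if kind is None:
            if current is not None:
                current[section].append(line.strip())  # continuation of the last part
            continue
        marker = next(m for m, k in markers if k == kind)
        body = line[len(marker):].strip()
        if kind == "word":
            name, _, body = body.partition(" ")
            current = entries.setdefault(name, {"word": [], "sub": [], "see": []})
        elif current is None:
            continue  # sub-definition or reference with no word to attach to
        section = kind
        if body:
            current[section].append(body)
    return entries


SAMPLE = """filekey : :: :::
:Theory A general definition.
continued text
::parts First sub-definition.
:::see Practice
:Practice Another word.
"""
PARSED = parse_tagged(SAMPLE)
```

Since the markers are read from the filekey line rather than hard-coded, the same function accepts other character sequences, for example a first line of "filekey $word def-subs also-see:".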

The file doesn't need to contain the filekey itself, but the filekey character
assignment needs to be the last filekey set used. This can be accomplished by creating a
small file that sets the key and then points to the target. Keep in mind that
simpler interfaces to this are not the subject here... we are starting simple.

You could remove the filekey from the file used in the above example and set it
directly or indirectly to

": # :::" to focus in on the references in that document

or to ": + :::" to focus in on the bullets in that document.

It's very common for us to organize or map the subject matter (software and text)
with three such primary levels. Unix man pages, for example, have their general
description, followed by sub-description parts, examples, etc., and then
finish with "also see"...

For example:

python iq.py -k http://threeseas.net/vic/IQ-ID/thor-arexx.iq . .

Tagging can be simple and versatile; it's the search engine that does all the work.
Here the tags, not counting the "filekey", consist of three character sequences, ":",
"::" and ":::", but could have been "$word", "def-subs" and "also-see:".

More fun from the keys found above:

python iq.py http://threeseas.net/vic/IQ-ID/thor-arexx.iq SAVEMESSAGE .

and

python iq.py http://threeseas.net/vic/IQ-ID/thor-arexx.iq SAVEMESSAGE INPUTS


The iq.py command is only one of nine. For the individual commands there may be other
similar tools, but together they provide a lot more power, not only for searching
but for creating, using the same repetitive command set (a learning thing)
from the user's POV. This is fundamental automation functionality.

The same functionality can be used to read how something works, to integrate its code
into another program, or to execute it directly by automating everything between its
access and its execution.

But it's all the same function/action set that does it.


Tagging: it's about improving accessibility... and use.

Note: the last two command line examples in the original mailing list message are incorrect as posted; remove the "-k" option from them.

As mentioned, this IQ command is only one of nine commands. Another very similar
command is the ID command, which performs definable tests (sample: idtests.py & idtest.id)
on its input argument to determine what action to take. This might be the action
of applying one or more of the many prior art post processors mentioned above.
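
The general idea of "definable tests that pick an action" can be sketched as an ordered list of test/action pairs. The rules and action names below are invented for illustration and are not taken from idtests.py or idtest.id.

```python
import re

# Hypothetical (test, action) pairs, checked in order; the first passing test wins.
RULES = [
    (lambda arg: re.match(r"https?://", arg) is not None, "fetch"),
    (lambda arg: arg.endswith((".py", ".c")), "scan-source"),
    (lambda arg: arg.isdigit(), "lookup-number"),
]


def identify(arg, rules=RULES, default="ignore"):
    """Return the action name of the first rule whose test accepts `arg`."""
    for test, action in rules:
        if test(arg):
            return action
    return default
```

Keeping the tests in a plain list means new ones can be defined without changing the function that applies them.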

IQ and ID are currently stand-alone (not yet integrated with the other commands):
http://threeseas.net/vic/IQ-ID/

I have moved the project forward, however slowly (converted to Python from ARexx and
much closer to a complete base), since the 1999 USPTO RFC response. The command line
base mainly needs to have the KE command written, IQ and ID integrated into it, and
general clean up, bug fixing and optimization. Being GPL, it's forkable.

Also: the code for IQ, ID and the rest of the project (most current) has been tagged
in a manner compatible with IQ searching.


Big Picture

my first try to understand how it is supposed to work ;-)

[Image: Tagging-prototype-big-picture-0.1.png]


More Information

Until further integration into this wiki page, this mailing list link
has a thread or two near the bottom with "Tagging prototype" in the subject line, continuing at the beginning of the following months.

Perhaps these would make for a Q&A section?
