archive-edu.com » EDU » I » ILLINOIS.EDU

Total: 443

  • HTRC Development Tools
    Center Software Development Tools: Confluence, JIRA, Bamboo, Fish Eye, Crowd, HTRC site, HathiTrust site, HTRC Sandbox Portal, bookworm, HTRC NGPD. To sign up for an account click here. 2012 HathiTrust

    Original URL path: http://sandbox.htrc.illinois.edu/ (2014-12-16)


  • HTRC Portal -
    a smaller public domain subset of the HathiTrust. Read more about the Sandbox or visit the main HTRC Portal. Welcome to the HathiTrust Research Center Sandbox. About Us: The HathiTrust Research Center (HTRC) provides research access to the public domain text of the HathiTrust Digital Library. The HTRC is a collaborative research center launched jointly by Indiana University and the University of Illinois, along with the HathiTrust Digital Library, to help meet the technical challenges of dealing with massive amounts of digital text that researchers face by developing cutting-edge software tools and cyberinfrastructure to enable advanced computational access to the growing digital record of human knowledge. The HTRC provides an infrastructure to search, collect, analyze, and visualize the full text of nearly 3 million public domain works and is intended for nonprofit and educational researchers. Sign In to Begin. What is the Sandbox? The HTRC Sandbox is distinct from the main production portal of the HTRC. The HTRC Sandbox is meant to be an arena for users to try out experiments and do exploratory work. The dataset available on the sandbox is a much smaller subset of that associated with the HTRC's main production portal. The HTRC Sandbox

    Original URL path: https://sandbox.htrc.illinois.edu/HTRC-UI-Portal2/ (2014-12-16)

  • htrc bookworm
    This Bookworm is an exploration of the HathiTrust Research Center, a collaborative research center launched jointly by Indiana University and the University of Illinois, along with the HathiTrust Digital Library. Bookworm was created by Benjamin Schmidt (Department of History, Northeastern), Matt Nicklay, Neva Cherniavsky Durand, Martin Camacho, and Erez Lieberman Aiden at the Cultural Observatory. It enables you to visually explore lexical trends. Bookworm

    Original URL path: http://sandbox.htrc.illinois.edu/bookworm/ (2014-12-16)

  • Acknowledgements
    product suite. Atlassian provides best-in-breed solutions for bug tracking, project tracking, Subversion hosting, and wiki software. ej technologies: We also want to thank ej technologies for kindly providing us with an open source license to their award-winning

    Original URL path: http://sandbox.htrc.illinois.edu/acknowledgements.html (2014-12-16)

  • HTRC Portal -
    a smaller public domain subset of the HathiTrust. Read more about the Sandbox or visit the main HTRC Portal. Welcome to the HathiTrust Research Center Sandbox. About Us: The HathiTrust Research Center (HTRC) provides research access to the public domain text of the HathiTrust Digital Library. The HTRC is a collaborative research center launched jointly by Indiana University and the University of Illinois, along with the HathiTrust Digital Library, to help meet the technical challenges of dealing with massive amounts of digital text that researchers face by developing cutting-edge software tools and cyberinfrastructure to enable advanced computational access to the growing digital record of human knowledge. The HTRC provides an infrastructure to search, collect, analyze, and visualize the full text of nearly 3 million public domain works and is intended for nonprofit and educational researchers. Sign In to Begin. What is the Sandbox? The HTRC Sandbox is distinct from the main production portal of the HTRC. The HTRC Sandbox is meant to be an arena for users to try out experiments and do exploratory work. The dataset available on the sandbox is a much smaller subset of that associated with the HTRC's main production portal. The HTRC Sandbox

    Original URL path: https://sandbox.htrc.illinois.edu/HTRC-UI-Portal2/HomeAction (2014-12-16)

  • HTRC Portal - About
    the main HTRC Portal. Overview: The HathiTrust Research Center (HTRC) enables computational access for nonprofit and educational users to published works in the public domain and, in the future, on limited terms, to works in copyright from the HathiTrust. The HTRC is a collaborative research center launched jointly by Indiana University and the University of Illinois, along with the HathiTrust Digital Library, to help meet the technical challenges of dealing

    Original URL path: https://sandbox.htrc.illinois.edu/HTRC-UI-Portal2/AboutAction (2014-12-16)

  • HTRC Portal - About
    language: The primary language of the given volume.
    htBibUrl: The HathiTrust Bibliographic API call for the volume.
    handleUrl: The persistent identifier for the given volume.
    oclc: The array of OCLC number(s).
    imprint: The publication place, publisher, and publication date of the given volume.

    Features: The features extracted from the content of the volume.
    schemaVersion: A version identifier for the format and structure of the feature data (HTRC generated).
    dateCreated: The time the batch of metadata was processed and recorded (HTRC generated).
    pageCount: The number of pages in the volume.
    pages: An array of JSON objects, each representing a page of the volume.

    Page: Pages are contained within volumes; they have a sequence number and information about their header, body, and footer.

    Page level information:
    seq: The sequence number. See notes on ID usage.
    tokenCount: The total number of tokens in the page.
    lineCount: The total number of non-empty lines in the page.
    emptyLineCount: The total number of empty lines in the page.
    sentenceCount: Total number of sentences identified in the page using OpenNLP. Details on parsing.

    Header, Body, and Footer information: The fields for header, body, and footer are the same, but apply to different parts of the page. Read about the differences between the sections.
    tokenCount: The total number of tokens in this page section.
    lineCount: The number of lines containing characters of any kind in this page section. This represents the layout of the page; for sentence counts, see the sentenceCount field.
    emptyLineCount: The number of lines without text in this page section.
    sentenceCount: The number of sentences found in the text in this page section, parsed using OpenNLP.
    tokens: An unordered list of all tokens, characterized by part of speech using OpenNLP, and their corresponding frequency counts, in this page section. Tokens are case sensitive. There will be separate counts, for instance, for rose (noun) and rose (verb), while a capitalized Rose is shown as a separate token. Words separated by a hyphen across a line break are rejoined. No other data cleaning or OCR correction was performed. Details on POS parsing and types of tags used.
    beginLineChars: Count of the initial character of each line in this page section (ignoring whitespace).
    endLineChars: Count of the last character on each line in this page section (ignoring whitespace).

    Download Links: This feature dataset is licensed under a Creative Commons Attribution 4.0 International License.
    Download: Below we provide the extracted feature data for download, in the form of chronologically sequential bundles consisting of page level features and metadata. While we attempted to keep the bundles of similar size, the file size varies because we were careful to not split years between bundles.
    Data: Sample (126M), pre-1850 (4.2G), 1850-1879 (5.5G), 1880-1889 (3.3G), 1890-1899 (4.4G), 1900-1909 (5.7G), 1910-1919 (5.5G), post-1919 (2.0G).
    Rsync: The data is also set up to be downloaded with rsync. This has the benefit of allowing you to download feature files

    Original URL path: https://sandbox.htrc.illinois.edu/HTRC-UI-Portal2/Features (2014-12-16)
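The page-level fields listed in the Features entry above can be tallied with a few lines of Python. This is only a minimal sketch, not HTRC's own tooling: the JSON key names `features`, `header`, `body`, and `footer`, the layout of `tokens` (token, then part-of-speech tag, then count), and the tag names are assumptions inferred from the field descriptions, and `SAMPLE` is invented data rather than a real feature file.

```python
import json
from collections import Counter

# Hypothetical two-page volume shaped after the field descriptions above.
# Key names and the token -> POS tag -> count layout are assumptions.
SAMPLE = json.loads("""
{
  "features": {
    "pageCount": 2,
    "pages": [
      {
        "seq": "00000001",
        "header": {"tokens": {"Rose": {"NNP": 1}}},
        "body": {"tokens": {"rose": {"NN": 1, "VB": 1}, "the": {"DT": 1}}},
        "footer": {"tokens": {}}
      },
      {
        "seq": "00000002",
        "header": {"tokens": {}},
        "body": {"tokens": {"rose": {"NN": 2}}},
        "footer": {"tokens": {}}
      }
    ]
  }
}
""")


def section_token_counts(section):
    """Sum the per-POS counts of each token in one page section."""
    counts = Counter()
    for token, pos_counts in section.get("tokens", {}).items():
        counts[token] += sum(pos_counts.values())
    return counts


def volume_token_counts(volume, sections=("header", "body", "footer")):
    """Aggregate case-sensitive token counts over the chosen page sections."""
    totals = Counter()
    for page in volume["features"]["pages"]:
        for name in sections:
            totals.update(section_token_counts(page.get(name, {})))
    return totals


all_counts = volume_token_counts(SAMPLE)
body_counts = volume_token_counts(SAMPLE, sections=("body",))
print(all_counts.most_common())
```

Because tokens are case sensitive, `rose` and `Rose` stay separate keys in the result; restricting `sections` to `("body",)` skips header and footer text such as running titles.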

  • HTRC Portal -
    already logged in to HTRC Portal, you need to log in to HTRC Workset Builder in order to create or modify a workset. Don't Show This Again. Cancel. Go. Upload Workset: You can upload a workset only in CSV format. Workset Name: Only characters A Z 0 9 or allowed. CSV File. Private Workset. Close. Upload. You are on the HTRC Sandbox. This is where new functionality is introduced on

    Original URL path: https://sandbox.htrc.illinois.edu/HTRC-UI-Portal2/ForgotPasswordAction (2014-12-16)