
Total: 2

  • AMPLab – UC Berkeley | Algorithms, Machines and People Lab
    resource management and debuggers to find and correct programming errors. We are developing datacenter-scale implementations of these components so that using a datacenter for analytics becomes as easy as using a computer today. People will play a key role in data-intensive applications, not simply as passive consumers of results but as active providers and gatherers of data, and to solve ML-hard problems that algorithms on their own cannot solve. With crowdsourcing, people can be viewed as highly valuable but unreliable and unpredictable resources in terms of both latency and answer quality. They must be incentivized appropriately to provide quality answers despite varying expertise, diligence, and even malicious behavior. The AMPLab is addressing these issues in all phases of the analytics lifecycle.
    Events: AMPLab Summer Retreat, May 18-20, 2015, Chaminade, Santa Cruz, CA (By Invitation Only). Dissertation Talk: Mosharaf Chowdhury (AMPLab), "Mending the Application-Network Gap in Big Data Analytics", Friday 5/15, 9am, 465 Soda. Dissertation Talk: Gene Pang (AMPLab), "Scalable Transactions for Scalable Distributed Databases", W 5/13, noon, 405 Soda.
    Featured Project: Tachyon, A Reliable Memory-Centric Storage for Big Data Analytics. Memory is the key to fast Big Data processing. This has been realized by many, and frameworks such as Spark and Shark already leverage memory performance. As data sets continue to grow, storage is increasingly becoming a critical bottleneck in many workloads. To address this need, we have developed Tachyon, a memory-centric, fault-tolerant distributed file system, which enables reliable file sharing at memory speed across cluster frameworks such as Spark and MapReduce. The result of over two years of research, Tachyon achieves memory speed and fault tolerance by using memory aggressively and leveraging lineage information. Tachyon caches working set files in memory

    Original URL path: https://amplab.cs.berkeley.edu/ (2015-05-18)


  • About | AMPLab – UC Berkeley
    range of tasks. Supporting the more varied demands of general data analysis will require a new software infrastructure for WSCs, incorporating flexible programming abstractions specifically tailored to the highly parallel datacenter computing environment. Massive amounts of new online data provide significantly more raw material for data analysis. However, this data comes from diverse sources with no common schema and is of variable quality. We need radically new data management techniques to tame these huge, heterogeneous, and highly imperfect datasets. The great diversity of data sources will enable a far greater range of queries than those supported by traditional data analysis systems, and the ever-increasing size of the datasets means that traditional data analytics algorithms will require more computational resources and incur higher delays. We thus need far more flexible, scalable, and tunable analysis algorithms so that, over a wide range of queries, explicit tradeoffs can be made between delay, cost, and quality of answer. Crowdsourcing allows, for the first time, large-scale and on-demand invocation of human input. For problems that are ML-hard (i.e., are difficult for traditional machine learning and other automated tools), crowdsourcing provides an attractive alternative. To be widely useful, however, these crowdsourcing methods must be tightly integrated within more general data analytics frameworks. Meeting these challenges will require an entirely new approach that transcends and reshapes disciplinary boundaries. The AMPLab is a five-year collaborative effort at UC Berkeley, involving students, researchers, and faculty from a wide swath of computer science and data-intensive application domains, to address the Big Data analytics problem. AMP stands for Algorithms, Machines, and People. AMPLab envisions a world where massive data, cloud computing, communication, and people resources can be continually, flexibly, and dynamically brought to bear on a range of hard problems by people connected

    Original URL path: https://amplab.cs.berkeley.edu/about/ (2015-05-18)