About Pittsburgh Lab
Intel Research Pittsburgh: Delivering Results, Shaping the Future
Introduction
If you visit the Intel Research Pittsburgh lab on the campus of Carnegie Mellon University, you may find a Ph.D. student in deep discussion with an Intel researcher. A seminar might be in session at one end of the lab while a project team at the other end gathers around a computer screen to test their latest software prototype. The atmosphere is collegial, the feeling is open and inviting. What you won't find are Intel researchers holed up in a corner of the lab, speaking in hushed tones, stopping when a student or faculty member walks by.

Intel Research Pittsburgh is one of four labs in the Intel Research Network, an innovative model of industry-university research pioneered by Intel to enhance and accelerate long-term research. Under the model, which stresses openness and collaboration, researchers flow freely between the labs, Intel and the university, conducting joint research projects. It's a win-win arrangement designed to generate breakthrough results.

Each lab in the network focuses on a different field of research that Intel considers important to advancing the vision of proactive computing, in which billions of devices embedded throughout the environment will anticipate people's needs and take action on their behalf. The Pittsburgh lab focuses broadly on the challenges of software running on distributed systems. Research is being conducted in three key areas: distributed systems with intensive computing at the edge; massively distributed systems combined with robotics; and software to exploit many-core architectures.

Since its opening in early 2002, Intel Research Pittsburgh has grown into a thriving research organization. In a few short years, the open and collaborative research model under which the lab operates has proven its ability to deliver results.

About David O'Hallaron

David O'Hallaron is an Associate Professor of Computer Science and Electrical and Computer Engineering at Carnegie Mellon University.  Starting in July 2007, he took a three-year leave of absence to become Director of Intel Research Pittsburgh.

O'Hallaron earned his M.S. and Ph.D. degrees in Computer Science from the University of Virginia. Prior to joining the faculty of Carnegie Mellon in 1989, he was a Computer Scientist at General Electric’s R&D Center in Schenectady, New York. O'Hallaron works in the broad area of computer systems, with specific interests in large-scale scientific computing, computational database systems, and virtualization. He currently leads (with Jacobo Bielak) the Carnegie Mellon Quake project, one of the world’s leading groups developing the capability to predict the motion of the ground during earthquakes. He also leads (with Greg Ganger and Natassa Ailamaki) an effort to develop Computational Database Systems that represent massive scientific datasets as database structures and that perform the scientific computing process by creating, querying, and updating these databases.

O’Hallaron has won a number of awards for teaching and research excellence. He is co-author of two computer science textbooks as well as many peer-reviewed articles in leading conferences and journals.




About Limor Fix
Limor Fix is the Associate Director of Intel Research Pittsburgh and an Intel Principal Engineer. She earned a Ph.D. in Computer Science from the Technion, Israel Institute of Technology. After graduation, she conducted postdoctoral research at Cornell University before joining Intel Israel in 1994. While in Israel, she initiated and led a major change in Intel's validation methodology, developing innovative formal verification tools and methodology that have been widely adopted by Intel's design teams.

Dr. Fix is the author of more than 25 publications and has been invited to serve on more than 15 technical program committees of leading international conferences such as Computer Aided Verification (CAV), Design, Automation and Test in Europe (DATE), and Correct Hardware Design and Verification Methods (CHARME). For the last three years (2003-2005), Dr. Fix has been a member of the executive committee of the Design Automation Conference (DAC), the premier Electronic Design Automation (EDA) and silicon solution event.

"I think the open and collaborative research model is fantastic," says Mowry. "As a university professor, I've collaborated with the research labs of several companies. You might visit a lab once a year or so, and come away thinking that you should work together more. But in practice, it's difficult to collaborate, because there are typically IP barriers to collaboration." By contrast, collaborating with Intel Research Pittsburgh is easy, and encouraged. Students and faculty move freely between the campus and the lab and engage in informal discussions or formal projects.

Randy Bryant, Dean of the School of Computer Science at Carnegie Mellon, reaffirms the importance of Intel's open and collaborative approach. "The model makes a huge difference," says Bryant. "I've worked in various collaborative relationships with industry, and the issues of confidentiality and of IP are usually big sticking points. Imagine a student trying to do his thesis but also working at the lab. Normally, the student would have to wall off what he's doing at one place from the other and be careful not to intermingle any ideas from the two places. That's very awkward and unnatural, and it's counterproductive. A model that gives students the complete freedom to know they can work at the Intel lab and incorporate that work in a thesis, for example, completely changes the nature of how universities can collaborate with industry. The model makes it easy for information to flow freely between Carnegie Mellon and Intel. As a result, we have developed some very close connections between the two places."





Mutual Benefits
Beyond the open and collaborative research model, the lab provides several benefits to both Carnegie Mellon and Intel, including access to an expanded pool of researchers. "So many exciting ideas are being generated at Carnegie Mellon, and the students and faculty are top notch," says Mowry. "Having the ability to work with them is a real benefit to Intel. In exchange, Carnegie Mellon gains access to the world-class researchers in our lab. This expands the number of researchers who are focusing on similar problems, which should help to accelerate research efforts in those areas."

"Carnegie Mellon University is very strong in computer science and electrical engineering, in the areas that are of interest to us," Limor Fix adds. "We are working with university colleagues in the areas of databases, storage, microarchitecture, and verification. About 15 faculty members and many more of their students are collaborating with our lab, which is valuable to us."

The lab also gives Carnegie Mellon students and faculty access to the rich resources of Intel, and to real-world experience to complement their academic training. "As an industry leader, Intel offers some special resources and insights that can help to shape academic research," says Mowry. "Intel researchers have a good understanding of what's feasible in terms of real hardware, and can share their insights with their Carnegie Mellon collaborators. They provide grounding in what really can be done in industry. For example, in one of the new projects that I've started, called Dynamic Physical Rendering, we are working with Carnegie Mellon to try and build a very futuristic piece of hardware. The Intel researchers who are involved in the project bring valuable real-world experience in the area of hardware that would be difficult for university researchers to acquire on their own."

"Also, some research projects require other kinds of support that may be difficult for university researchers to acquire, such as equipment and access to staff," Mowry continues. "I think the combination of what's easy and challenging in both environments is very complementary."

Working with Intel gives academics the opportunity to potentially see their research translated into real-world technologies. "For any academic who collaborates with us, I think working with the lab and with Intel is a way to amplify their research results," says Fix. "For Carnegie Mellon, the lab is a channel for translating research results into technologies that can make a real impact on the computing world."

Both the university and Intel benefit from the internships the lab offers to Carnegie Mellon students. "Student internships are one of the important benefits of having the Intel lab nearby," says Srinivasan Seshan, Associate Professor in the Computer Science Department at Carnegie Mellon. "The internships expose our students to new projects, ideas, advisors and possible thesis topics. The non-Carnegie Mellon interns also expose our community to the ideas of research groups in other top universities and add to the great talent and interesting activities going on at the lab." Intel also benefits from the collaborative research students engage in during their internships.







Research Progress
Success Stories
Some of the early projects initiated at Intel Research Pittsburgh have made successful transitions from research toward wider deployment and real impact. Five success stories include Internet Suspend/Resume; IrisNet; Open DHT; Autograph and Polygraph; and Diamond. Each of these projects focuses on distributed systems with intensive computing at the edge.

Internet Suspend/Resume
The Internet Suspend/Resume (ISR) project team is exploring the application of virtual machine technology to improve system management and make it easier for users to recover from hardware or software failures. ISR uses virtualization to capture each user's entire computing state so that it can be centrally managed, enabling effective virus scanning, automatic full-system backup, and rapid environment migration. This methodology provides resilience in the presence of hardware or software failures for individual users. It also provides a tool for enterprise catastrophe recovery in the event of building fires or other disasters that affect many users simultaneously.

To test the new technology, the ISR team implemented a pilot deployment on the Carnegie Mellon campus. The pilot was a success, and today some 15 students and faculty of Carnegie Mellon are using ISR on a daily basis, to move their computing environment between their offices and homes. These early users have provided much useful feedback and ideas for improvement and enhancements.

Carnegie Mellon plans to expand the deployment of ISR on its Pittsburgh campus and at additional geographic sites. A National Science Foundation (NSF) grant was recently awarded to Carnegie Mellon for three more years of ISR research. The research will be led by Professor Satya in collaboration with three other professors at Carnegie Mellon University: Dave O'Hallaron, Adrian Perrig and Dave Farber.

The ISR technology is also being used in graduate-level courses at Carnegie Mellon. Satya used ISR in a course called "Mobile Computing Systems and Applications" that he co-taught with Professor Dan Siewiorek in spring 2005. The course has a significant project component, and ISR was the centerpiece from which a number of projects were generated (the Intel lab provided laptops with ISR installed). In his graduate course on Internet Services in the spring of 2005, Dave O'Hallaron encouraged graduate students to do group projects related to ISR. Two projects involved putting ISR on a bootable Pocket Hard Drive and a bootable CD-ROM, and another project team evaluated a technique called "ballooning" for shrinking ISR parcels.

IrisNet
Another project that has made a successful transition from the research lab is IrisNet (Internet-scale Resource-Intensive Sensor Network Services). IrisNet is a scalable infrastructure that enables users with Internet access to query Webcams and other globally distributed collections of high-bit-rate sensors. It is a strategically important technology that complements the mote-style sensing being advanced at Intel Research Berkeley. Both sensing technologies are important to enable the ubiquitous computing environments of the future.

The motes pioneered at the Berkeley lab are referred to as "smart dust" because they are tiny, simple machines that are inexpensive enough to be disposable. IrisNet uses much larger sensors, dubbed "brilliant rocks." They have CPU and memory comparable to laptops or handhelds. Compute-intensive algorithms that could not be run on a mote can be run successfully on an IrisNet sensing agent. For example, a sensing agent could handle vision algorithms, such as face recognition code powerful enough to enable identification of a person standing in front of a camera. Several services have been successfully demonstrated using IrisNet, including a monitoring service for a distributed infrastructure (PlanetLab), a parking space finder, and an ocean monitoring service. Multiple versions of the IrisNet source code have been released as open source, and new versions continue to be offered as the system is improved.

IrisNet has been a solid research success, and the research team has published a number of papers in leading venues. The main IrisNet research project has been completed, but several spin-off research activities continue. For example, a technique developed in the IrisNet project for identifying features in images is being applied in a forensic video reconstruction application, as part of the Diamond research project. Other research ideas developed during the course of the IrisNet project are being used by the Dynamic Physical Rendering (DPR) project now underway at the lab.

Open DHT
Systems such as Napster showed the potential of the peer-to-peer (P2P) approach to building large-scale Internet applications. Distributed hash tables (DHTs), an improved design for building scalable, robust P2P systems, make it easy to spread an application across hundreds (or millions) of Internet hosts and reap availability and system capacity benefits. But until recently, developers had to deploy an application-specific DHT for every DHT-based application they created. That required access to a large number of distributed computers that must be managed 24/7, a requirement few developers could meet.

To address this problem, researchers at Intel Research Pittsburgh and Intel Research Berkeley developed a publicly accessible, shared DHT service called Open DHT, with the goal of making it easy to develop distributed applications. Developers no longer need to deploy a DHT for every application they create. Instead, they can access and use Open DHT anytime, from any computer, for any application. The system offers a simple put and get interface, incorporates security features, and is designed to allocate available storage fairly across active clients.
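
To make the put and get interface concrete, here is a minimal in-memory sketch in Python. It is not the Open DHT API: the class `ToyDHT`, its method signatures, and the per-client quota are hypothetical names chosen for illustration, and a single process stands in for the many Internet hosts of a real deployment. The quota is only a crude stand-in for the fair storage allocation described above.

```python
import hashlib
from bisect import bisect_right


class ToyDHT:
    """Illustrative stand-in for a shared DHT service (not the Open DHT API)."""

    def __init__(self, node_names, quota_per_client=2):
        # Place each node on a hash ring; a key belongs to the first node
        # whose position follows the key's hash (wrapping around).
        self._ring = sorted((self._hash(n), n) for n in node_names)
        self._stores = {n: {} for n in node_names}
        self._usage = {}                  # client -> number of values stored
        self._quota = quota_per_client    # hypothetical fairness limit

    @staticmethod
    def _hash(text):
        return int(hashlib.sha1(text.encode()).hexdigest(), 16)

    def _owner(self, key):
        positions = [p for p, _ in self._ring]
        i = bisect_right(positions, self._hash(key)) % len(self._ring)
        return self._ring[i][1]

    def put(self, client, key, value):
        """Store value under key; refuse clients that exceed their quota."""
        if self._usage.get(client, 0) >= self._quota:
            return False
        self._stores[self._owner(key)].setdefault(key, []).append(value)
        self._usage[client] = self._usage.get(client, 0) + 1
        return True

    def get(self, key):
        """Return every value stored under key (empty list if none)."""
        return list(self._stores[self._owner(key)].get(key, []))
```

An application would call `put("alice", "song.mp3", "host-17")` and later `get("song.mp3")` without knowing which node holds the key; the hash ring decides that.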

Open DHT has been in limited deployment since April 2004 and was made accessible to the general public in December 2004. The measured performance of the service is promising; researchers report data availability to date at over 99.999%. In addition to the applications developed by the Open DHT team, a number of other groups are already experimenting with Open DHT to build a variety of applications, including a video streaming application, an end-user positioning service for location-based applications, a name resolution service and a storage repository for ISP traffic logs. Based on this early experience with Open DHT, researchers believe the service will greatly benefit developers by making it easier to build a range of Internet applications.



Figure: Open DHT Architecture

Autograph and Polygraph
Worms are a widely recognized threat to Internet-connected hosts. An Internet worm exploits a software vulnerability on networked servers to install and execute the worm's own code. Once a server is infected, the worm uses it to infect more servers. Interruptions in network services from worms cause significant financial damage; the estimated cost of the Code Red worm epidemic exceeds $1 billion.

To combat the problem, researchers at Intel Research Pittsburgh have developed Autograph, a worm defense system. Autograph monitors all network traffic entering an Internet edge network (e.g., a university's network) and produces signatures for novel worms in the monitored traffic entirely automatically.

There are two key aspects of Autograph's performance. First, Autograph takes advantage of distributed monitoring across many edge networks to generate signatures quickly. Second, Autograph generates high-quality signatures that match only worm traffic and do not cause false positives.
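
The text does not describe how Autograph derives its signatures, so the following Python sketch only illustrates the general idea of content-based signature generation under simplifying assumptions: a candidate signature is a fixed-length byte string that recurs across several suspicious payloads and never appears in known-benign traffic. The function name, the window length, and the thresholds are all hypothetical.

```python
from collections import Counter


def candidate_signatures(suspicious, benign, length=8, min_flows=3):
    """Return byte substrings common to many suspicious payloads.

    A substring qualifies if it occurs in at least `min_flows` distinct
    suspicious payloads and in no benign payload (to avoid false positives).
    """
    counts = Counter()
    for payload in suspicious:
        # Count each window once per payload so one flow cannot dominate.
        windows = {payload[i:i + length]
                   for i in range(len(payload) - length + 1)}
        counts.update(windows)
    return sorted(
        w for w, n in counts.items()
        if n >= min_flows and not any(w in b for b in benign)
    )
```

Pooling suspicious payloads from several monitors before calling the function corresponds to the distributed monitoring mentioned above: the recurrence threshold is reached sooner when more networks contribute flows.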

A full production prototype of the Autograph system is running 24/7 at the Intel Research Pittsburgh lab and at several other early-adopter sites around the network. An open-source release of the Autograph code is now available for download to the public.

Researchers have also created a set of algorithms called Polygraph, which extends Autograph to automatically generate signatures that match polymorphic worms, whose content changes each time they replicate. An open-source release of the Polygraph code will also be available for download in the near future.

Diamond (Interactive Exploration of Non-Indexed Data)
The Diamond (Interactive Exploration of Non-Indexed Data) project is a collaborative effort involving Carnegie Mellon University and Intel Research. The goal of the project is to enable rapid, interactive search of terabyte-scale, non-indexed collections of complex data, such as photo collections, satellite pictures and medical images. Diamond achieves this goal by distributing the search across multiple active storage devices (devices with processing capability embedded or nearby), operating in parallel. Files are examined and discarded near their storage locations rather than being sent to a central site for analysis. This process of filtering and "early discard" has been shown to significantly accelerate the search.
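
The "early discard" idea can be sketched in a few lines of Python. This is an illustration rather than Diamond's implementation: lists of dictionaries stand in for the contents of storage devices, threads stand in for the devices' own processors, and the function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def search_node(objects, filters):
    """Run on (or near) one storage device: keep objects passing every filter."""
    survivors = []
    for obj in objects:
        # all() stops at the first failing filter, so cheap filters listed
        # first discard most objects before expensive ones run.
        if all(f(obj) for f in filters):
            survivors.append(obj)
    return survivors


def distributed_search(nodes, filters):
    """Search every node in parallel and merge the few objects that survive."""
    with ThreadPoolExecutor(max_workers=max(1, len(nodes))) as pool:
        per_node = pool.map(lambda objs: search_node(objs, filters), nodes)
        return [obj for survivors in per_node for obj in survivors]
```

Only the survivors cross the network to the user, which is why discarding data close to where it is stored can speed up a search over non-indexed collections.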

To test the Diamond system, researchers have developed a proof-of-concept application that allows users to interactively search large collections of unlabeled photographs, filtering images based on color, texture and frontal human faces. This and other Diamond applications will be made publicly available to the research community in the near future.

Since the Diamond project was launched in January 2003, researchers have made significant progress, improving the efficiency of distributed storage systems, developing new machine learning algorithms to improve the accuracy of searches, and exploring new applications of the system, including a forensic video reconstruction application that could make it easier to solve crimes or thwart terrorist activity by enabling real-time analysis of data gathered from surveillance cameras.

In the future, the Diamond project will focus on applications in digital healthcare and drug discovery. Today Intel Research Pittsburgh and Carnegie Mellon University are applying the Diamond research to real-world problems in biomedical imaging. In collaboration with the University of Pittsburgh Medical Center, one of the nation's largest and most advanced integrated health systems, they are developing a system for computer-aided diagnosis of pathology images.

The potential for Diamond extends beyond these examples to many fields of scientific endeavor, from botany to astronomy. In general, any researcher who wants to test a hypothesis against a large amount of data could potentially benefit from Diamond's search capabilities.





New Research
Three of the newer research projects initiated at Intel Research Pittsburgh are Dynamic Physical Rendering (DPR), Anti-Spam, and Log-Based Architectures. The DPR project combines research in massively distributed systems with robotics. The Anti-Spam project, like the success stories outlined above, addresses the challenges of distributed systems with intensive computing at the edge. The Log-Based Architectures project focuses on the design of software to exploit many-core architectures.

Dynamic Physical Rendering
The goal of the Dynamic Physical Rendering (DPR) project, a collaboration with Carnegie Mellon University, is to create the next media type beyond audio and video. Specifically, researchers are attempting to capture and reproduce 3D scenes, including moving, physical 3D objects, that human senses would accept as real. This would eliminate the need for cumbersome virtual reality gear and overcome the viewing angle limitations of modern 3D approaches. The replicas would mimic the shape and appearance of a person or object being imaged in real time, and as the originals moved, so would their replicas. These 3D models would be physical entities, not holograms. You could interact with them just as if they were in the room with you.

Much work has already been done in capturing 3D images. The major challenge now is to reproduce those images as physical, moving 3D replicas. To do that, researchers are creating a form of programmable matter, a kind of high-tech modeling clay that the Carnegie Mellon collaborators have dubbed claytronics.

The basic unit of claytronics is what the Carnegie Mellon researchers refer to as a catom (claytronics atom). Replicas will be formed from ensembles of tiny catoms that bind to one another, via electromagnetic or electrostatic forces, to form physical analogs of virtual shapes.

Claytronics could enable a wide variety of applications, from videoconferences in which real and reproduced people interact at each location to physical 3D modeling of everything from cars to homes. Using this high-tech modeling clay, replicas can be created at different scales. For instance, surgeons could enter a room-size reproduction of a patient's beating heart and perform repairs, which would be transmitted to tiny instruments embedded inside the patient's body, where the actual work would be performed. At the other extreme, sporting events such as the Super Bowl could be reproduced in miniature as moving, physical 3D models on a desk or table top.

Researchers have already built a working two-dimensional prototype of units. Their immediate goal is to learn more about how an ensemble of units behaves before they start to work in three dimensions.

Anti-Spam
Intel researchers are exploring how to improve on existing anti-spam systems, to reduce false positives (legitimate email that is mistakenly tagged as spam) and false negatives (spam that is not filtered out).

To determine if a given message is spam, many anti-spam systems extract tokens such as words from the message. Despite the sophistication of some systems, it's difficult to determine if a message that includes, say, "mortgage" is spam, or an email from a friend who just purchased a home. Users can create whitelists to tell the system to allow mail from certain parties, but manually creating such lists is a time-consuming process.

Intel researchers are investigating a more effective whitelist system that takes advantage of a user's email social network. Under this "friends of friends" approach, users automatically whitelist email addresses that the user's correspondents consider valid. This approach eliminates false positives among the user's immediate and one-hop correspondents, making email reliable among this group of people. Researchers are addressing the privacy concerns of sharing personal whitelist information with other people as well as concerns over forgery (spammers pretending to be one of the people on the user's whitelist) through cryptography and secure protocols.
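
A minimal Python sketch of the "friends of friends" rule follows. It assumes each user's whitelist is available as a plain set of addresses, which sidesteps the privacy and forgery protections the researchers are working on; the function name and data layout are hypothetical.

```python
def effective_whitelist(user, whitelists):
    """Return the user's own whitelist plus those of their direct correspondents.

    `whitelists` maps each address to the set of addresses that person trusts.
    """
    own = whitelists.get(user, set())
    allowed = set(own)
    for friend in own:
        # One hop only: include addresses the friend trusts,
        # but not the whitelists of those addresses in turn.
        allowed |= whitelists.get(friend, set())
    allowed.discard(user)
    return allowed
```

With `{"alice": {"bob"}, "bob": {"carol"}, "carol": {"dave"}}`, mail to alice from bob or carol is accepted automatically, while dave is two hops away and still goes through ordinary filtering.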

To catch spam from unknown senders, researchers are also developing a distributed spam rejection system based on collaborative filtering. Under this approach, users in the email network "vote" on whether a message is spam. Once several users declare that a particular message is spam, other users can automatically filter it as such. Traditionally, distributed collaborative filtering has not addressed malicious users who try to distort the voting by submitting multiple votes or by preventing other users' votes from being counted. Intel researchers are exploring efficient techniques to deal with malicious users based on peer-to-peer networks.
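
The voting scheme can be illustrated with a small Python sketch. The class name, the fingerprint argument, and the threshold of three votes are assumptions made for this example. The sketch blocks only the simplest abuse, one user voting repeatedly; it does nothing against an attacker who controls many identities or who suppresses other users' votes, which are the harder problems described above.

```python
class SpamVotes:
    """Toy tally of spam votes, keyed by a message fingerprint."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self._voters = {}   # fingerprint -> set of users who voted "spam"

    def vote_spam(self, user, fingerprint):
        # A set records each user at most once, so repeat votes have no effect.
        self._voters.setdefault(fingerprint, set()).add(user)

    def is_spam(self, fingerprint):
        return len(self._voters.get(fingerprint, ())) >= self.threshold
```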

Log-Based Architectures
In addition to performance, an important concern for software developers is the time spent developing, modifying, testing, and supporting the code so that it behaves correctly in the field. Generally, programs "misbehave" because it's difficult to write bug-free code, especially for large software systems whose desired functionality may change and which are modified or extended over time.

To ensure that software behaves properly once it's deployed, Intel researchers are exploring the development of tools called "lifeguards" to monitor software programs and identify problems such as security attacks or incorrect behavior. At a minimum, lifeguards would notify the user of the problem; ideally, they would intervene to correct the problem.

Lifeguard tools already exist, but many of them slow down the program being monitored by a factor of two to ten or more. To address this problem, researchers are taking advantage of Intel's multi-core processors, so that the lifeguard tool could be run in the background on one core, continuously monitoring a software program on another core, without slowing the performance of the main program. A logging mechanism that is integrated into the processor and system architecture would allow lifeguards to recognize and fix correctness and performance problems automatically.
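
The division of labor between a monitored program and a lifeguard can be sketched in software. In the Python example below, a queue plays the role of the log and a second thread plays the role of the lifeguard, checking for one kind of bug (accessing memory after it has been freed). Everything here is an assumption made for illustration: the event format, the function names, and the check itself. In the architecture described above the log would be produced by the processor rather than by explicit calls, and Python threads show the structure only, not a true multi-core speedup.

```python
import queue
import threading


def lifeguard(log, alerts):
    """Consume the event log and flag accesses to memory that was freed."""
    freed = set()
    while True:
        event = log.get()
        if event is None:          # sentinel: monitored program has finished
            return
        kind, addr = event
        if kind == "free":
            freed.add(addr)
        elif kind == "alloc":
            freed.discard(addr)
        elif kind == "access" and addr in freed:
            alerts.append(f"use after free at {addr:#x}")


def run_monitored(events):
    """Replay `events` as the main program would log them; return the alerts."""
    log, alerts = queue.Queue(), []
    guard = threading.Thread(target=lifeguard, args=(log, alerts))
    guard.start()
    for event in events:           # the main program only appends to the log
        log.put(event)
    log.put(None)
    guard.join()
    return alerts
```

The main program never waits for the check to finish; it only appends to the log, which is the property that lets the lifeguard run on a separate core.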

Log-based architectures could greatly improve programmer productivity and software reliability, which will be increasingly important as we move to many-core architectures. By enabling lifeguards to run with negligible run-time overhead, researchers could provide significant value added from a multi-core architecture, beyond simple parallel execution to improve performance.





Moving Forward

In April 2005, Intel Research Pittsburgh moved from its temporary off-campus location to a permanent home in the new Collaborative Innovation Center (CIC) on the Carnegie Mellon campus. "We're located just a few hundred feet away from the computer science department, and we're very excited about that," says Mowry. The CIC is a four-story glass and masonry building that was designed with the intent of attracting high-tech companies to the Pittsburgh community, making it an ideal setting for Intel Research Pittsburgh.

Although the lab's former location was close enough to the university to encourage collaboration, having the lab on campus is fostering even closer connections with Carnegie Mellon researchers, according to Mowry. "The physical proximity is crucial," he says. "We have students who are basically doing their dissertations based on their collaborative projects at the lab. In many cases, Intel researchers serve on the thesis committees of students working at the lab. Our close proximity has resulted in some high-quality collaborations that would be difficult to achieve if we were physically located farther from the university." Those collaborations have already translated into the research successes highlighted earlier, including Internet Suspend/Resume, IrisNet, Open DHT, Autograph and Polygraph, and Diamond.

These initial projects benefited greatly from Carnegie Mellon's research strengths in the area of distributed systems. The lab's new projects, such as Dynamic Physical Rendering and Log-Based Architectures, are tapping into additional research strengths of the university. "We are extending our collaborations to Carnegie Mellon researchers in the areas of robotics, computer architecture, formal verification, and storage systems," Mowry notes. "We look forward to working together with our university colleagues to produce more research success stories in the years to come."

Copyright 2007 Intel Corporation