Classes for Spring 2010

January 11th, 2010 Irfan Essa Posted in Computational Journalism, Teaching | No Comments »

Happy 2010! In Spring Term 2010, I am teaching the following two classes.

Computation + Journalism (CS 4464 / CS 6465)

This class is aimed at understanding the computational and technological advancements in the area of journalism. Primary focus is on the study of technologies for developing new tools for (a) sense-making from diverse news information sources, (b) the impact of more and cheaper networked sensors, (c) collaborative human models for information aggregation and sense-making, (d) mashups and the use of programming in journalism, (e) the impact of mobile computing and data gathering, (f) computational approaches to information quality, (g) data mining for personalization and aggregation, and (h) citizen journalism.

Computing, Society and Professionalism (CS 4001)

Although Computing, Society and Professionalism is a required course for CS majors, it is not a typical computer science course. Rather than dealing with the technical content of computing, it addresses the effects of computing on individuals, organizations, and society, and on what your responsibilities are as a computing professional in light of those impacts. The topic is a very broad one and one that you will have to deal with almost every day of your professional life. The issues are sometimes as intellectually deep as some of the greatest philosophical writings in history – and sometimes as shallow as a report on the evening TV news. This course can do little more than introduce you to the topics, but, if successful, will change the way you view the technology with which you work.


Paper Advanced Robotics (2009): “Human Action Recognition Using Global Point Feature Histograms and Action Shapes”

October 29th, 2009 Irfan Essa Posted in Activity Recognition, Franzi Meier, Intelligent Environments, Michael Beetz, Papers | No Comments »

Radu Bogdan Rusu, Jan Bandouch, Franziska Meier, Irfan Essa and Michael Beetz (2009) “Human Action Recognition Using Global Point Feature Histograms and Action Shapes”, in Journal of Advanced Robotics, volume 23, pages 1873–1908, Koninklijke Brill NV, Leiden and The Robotics Society of Japan, 2009. [ DOI | PDF]

Abstract

This paper investigates the recognition of human actions from three-dimensional (3-D) point clouds that encode the motions of people acting in sensor-distributed indoor environments. Data streams are time sequences of silhouettes extracted from cameras in the environment. From the 2-D silhouette contours we generate space–time streams by continuously aligning and stacking the contours along the time axis as third spatial dimension. The space–time stream of an observation sequence is segmented into parts corresponding to subactions using a pattern matching technique based on suffix trees and interval scheduling. Then, the segmented space–time shapes are processed by treating the shapes as 3-D point clouds and estimating global point feature histograms for them. The resultant models are clustered using statistical analysis and our experimental results indicate that the presented methods robustly derive different action classes. This holds despite large intra-class variance in the recorded datasets due to performances from different persons at different time intervals.
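The stacking step described in the abstract can be illustrated with a small sketch. This is not the paper's implementation: the centroid-based alignment, the function names, and the `dt` frame spacing are assumptions chosen to keep the example short.

```python
from typing import List, Tuple

Point2D = Tuple[float, float]
Point3D = Tuple[float, float, float]


def centroid(contour: List[Point2D]) -> Point2D:
    """Mean of the contour points; used here as a simple alignment reference."""
    n = len(contour)
    return (sum(x for x, _ in contour) / n, sum(y for _, y in contour) / n)


def stack_contours(contours: List[List[Point2D]], dt: float = 1.0) -> List[Point3D]:
    """Center each 2-D contour on its centroid and stack the frames along time.

    The result is a space-time shape expressed as a 3-D point cloud (x, y, t).
    Centroid alignment is an assumption of this sketch, not the paper's method.
    """
    cloud: List[Point3D] = []
    for frame_index, contour in enumerate(contours):
        if not contour:
            continue  # no silhouette was extracted for this frame
        cx, cy = centroid(contour)
        t = frame_index * dt
        cloud.extend((x - cx, y - cy, t) for x, y in contour)
    return cloud
```

A point cloud built this way could then be handed to any global shape descriptor; the paper uses global point feature histograms for that step.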

© Koninklijke Brill NV, Leiden and The Robotics Society of Japan, 2009

Overview of the approach.


Keywords: Action recognition, point cloud, global features, action segmentation


Paper ISMAR 2009 (IEEE International Symposium on Mixed and Augmented Reality): “Augmenting Aerial Earth Maps with Dynamic Information”

October 20th, 2009 Irfan Essa Posted in Computational Journalism, Computational Photography and Video, Kihwan Kim, Modeling and Animation, Papers | No Comments »

Kihwan Kim, Sangmin Oh, Jeonggyu Lee and Irfan Essa (2009), “Augmenting Aerial Earth Maps with Dynamic Information,” In Proceedings of IEEE International Symposium on Mixed and Augmented Reality (ISMAR), Orlando, FL, USA, October 2009 [Project Site, Video (AVI/DivX), Video (YouTube), Paper (PDF)].

Abstract

We introduce methods for augmenting aerial visualizations of Earth (from tools such as Google Earth or Microsoft Virtual Earth) with dynamic information obtained from videos. Our goal is to make Augmented Earth Maps that visualize the live broadcast of dynamic sceneries within a city. We propose different approaches to analyze videos of pedestrians and cars, under differing conditions and then augment Aerial Earth Maps (AEMs) with live and dynamic information. We also analyze natural phenomenon (clouds) and project information from these to the AEMs to add the visual reality.


In the News (2009): CNN.com “Augmenting Earth Maps”

October 13th, 2009 Irfan Essa Posted in In The News, Kihwan Kim | No Comments »

Video – Breaking News Videos from CNN.com.

Check out the media coverage of our new paper to appear in ISMAR 2009, in October.

Also see

  • “Latest videos makes Google Earth cities bustle” New Scientist (Sep 30, 2009 Issue)
  • “Video: Google Earth animated with real time human and vehicular traffic” Engadget (Sep 30, 2009)


Event (2009): IEEE Workshop on Computer Vision for Humanoid Robots in Real Environments

September 23rd, 2009 Irfan Essa Posted in Events | No Comments »

IEEE Workshop on Computer Vision for Humanoid Robots in Real Environments.

I am co-organizing the First IEEE Workshop on Computer Vision for Humanoids in conjunction with the ICCV conference in Kyoto, Japan. The workshop will be held on September 27, 2009 (9:30am – 6:00pm).

The goal of this workshop is to bring together experts from the fields of computer vision and robotics who are working on humanoid robots with vision as one of the primary modalities. Topics of interest include, but are not limited to:

  • Visual Learning in Robots
  • Human Robot Interaction
  • Grasping and Manipulation
  • Learning by Demonstration
  • Task Learning for Robots
  • Activity Recognition and Discovery for Robots
  • Humanoid Navigation in Real Environments
  • Vision Devices and Systems for Robot Applications
  • Application of Humanoid Robots (Indoor/Outdoor, Entertainment)

This is the first attempt at a workshop that crosses from Humanoids Research to Computer Vision Research.

The workshop includes six invited talks as well as an open poster session, where all participants are expected to present a poster describing their recent work.

Location: Kyoto University, Faculty of Engineering Bldg.#3, 2F, Room W201, in conjunction with ICCV. (See http://www.iccv2009.org/workshops/index.html).

For schedule, abstracts and other information, see the workshop website. More information about ICCV at http://www.iccv2009.org/.

Invited Speakers and Organizers after the Workshop



PERSONAL: Recently upgraded to Irfan v 1.1

July 11th, 2009 Irfan Essa Posted in Personal | No Comments »

See my personal weblog at http://personal.irfanessa.com/2009/07/13/irfan-updated-to-v1-1/


Paper (2009) In IEEE Transactions on Visualization and CG “Fluid Simulation with Articulated Bodies”

June 10th, 2009 Irfan Essa Posted in Greg Turk, Modeling and Animation, Nipun Kwatra | No Comments »

Nipun Kwatra, Chris Wojtan, Mark Carlson, Irfan A. Essa, Peter J. Mucha, Greg Turk (2009), “Fluid Simulation with Articulated Bodies”, IEEE Transactions on Visualization and Computer Graphics, 10 Jun. 2009. IEEE Computer Society Digital Library. IEEE Computer Society. [DOI | PDF (see copyright) | Video | Website]

Abstract

We present an algorithm for creating realistic animations of characters that are swimming through fluids. Our approach combines dynamic simulation with data-driven kinematic motions (motion capture data) to produce realistic animation in a fluid. The interaction of the articulated body with the fluid is performed by incorporating joint constraints with rigid animation and by extending a solid/fluid coupling method to handle articulated chains. Our solver takes as input the current state of the simulation and calculates the angular and linear accelerations of the connected bodies needed to match a particular motion sequence for the articulated body. These accelerations are used to estimate the forces and torques that are then applied to each joint. Based on this approach, we demonstrate simulated swimming results for a variety of different strokes, including crawl, backstroke, breaststroke and butterfly. The ability to have articulated bodies interact with fluids also allows us to generate simulations of simple water creatures that are driven by simple controllers.
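To give a feel for the "accelerations needed to match a motion sequence" step, here is a deliberately simplified one-dimensional sketch. It is not the paper's solver, which handles joint constraints and solid/fluid coupling; the function names and the finite-difference scheme are assumptions made for illustration.

```python
from typing import List


def target_accelerations(positions: List[float], dt: float) -> List[float]:
    """Estimate accelerations from sampled positions of a target motion.

    Uses central second differences, a[i] = (x[i+1] - 2*x[i] + x[i-1]) / dt**2,
    so the result has two fewer entries than the input.
    """
    if dt <= 0:
        raise ValueError("dt must be positive")
    return [
        (positions[i + 1] - 2.0 * positions[i] + positions[i - 1]) / (dt * dt)
        for i in range(1, len(positions) - 1)
    ]


def required_force(mass: float, acceleration: float) -> float:
    """Newton's second law for a single body: F = m * a."""
    return mass * acceleration
```

In the full method the analogous quantities are angular and linear accelerations of connected bodies, and the resulting forces and torques are applied at each joint.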



Time Magazine (2009) Article “Can Computer Nerds Save Journalism?”

June 8th, 2009 Irfan Essa Posted in Computational Journalism, In The News, Interesting | No Comments »

Can Computer Nerds Save Journalism?, TIME Magazine, by MATT VILLANO, June 8, 2009

EXCERPT

“At the Georgia Institute of Technology in Atlanta, a three-year-old program in “computational journalism” helps computer-science majors study how journalists gather, organize and utilize information, then take these workflows and see how technology can make the processes easier.”

Full article here. Also see CnJ site.


Presentation at International Workshop on Video (2009): “Temporal Representations of Video for Analysis and Synthesis”

May 26th, 2009 Irfan Essa Posted in Computational Photography and Video, Presentations | No Comments »

“Temporal Representations of Video for Analysis and Synthesis” at IWV09: International Workshop on Video, in Barcelona, Spain, May 25-27, 2009.

(Slides, NO Video)

Abstract

I will present a variety of temporal models of video that we have been studying (and developing) for analysis and synthesis of video. For synthesis of videos, we have been developing representations that support example-based re-synthesis and spatio-temporal re-targeting. These approaches build on graph-based methods and we present techniques for similarity metrics for video, segmentation in video, and merging of different video streams. I will showcase a series of examples of these approaches applied to generate new videos.

For analysis of videos, we have developed a series of representations to observe and model activities in videos. Building on low-level measures of movement and motion in videos, we have incorporated higher-level temporal generative models to represent and recognize observed activities. I will discuss the strengths of a variety of State-based, Markovian, Grammar-based and Network-based representations that we have employed for recognizing activities from video. I will also discuss approaches for unsupervised discovery and recognition of activities.

Time permitting, I will describe some new efforts that move towards understanding mobile imaging and video, video authoring, and video on the web. Within these I will discuss issues of collaborative imaging, collective authoring, ad-hoc sensor networks, and peer production with images and videos. Using these concepts to focus the conversation, I will discuss how all of these issues are impacting the field of Journalism and Reporting and how we have started on a new interdisciplinary research and education effort we call Computational Journalism.


Presentation at CMU’s Computational Thinking Seminar Series (2009): “From Computational Photography and Video to Computational Journalism”

March 10th, 2009 Irfan Essa Posted in Computational Journalism, Computational Photography and Video, Presentations | No Comments »

From Computational Photography and Video to Computational Journalism

Irfan Essa
Georgia Institute of Technology
School of Interactive Computing, GVU and RIM Centers
April 21, 2009.

(see the video of this presentation)

Abstract


Our consumption of images (photography/video) continues to grow with the pervasiveness of computing (networking, mobile and media) technologies in our daily lives. Everyone now has a mobile camera, and digital image capture, processing, and sharing have become ubiquitous in our society. This has had a significant impact on how we want to (a) create novel scenes, (b) share our experiences with images, and (c) interact with large amounts of images and videos from many sources. In this talk, I will start with a brief overview of a series of ongoing efforts in the analysis of images and videos for rendering novel scenes, interacting with images/videos and collaboratively authoring new content. I will describe some work on video-based rendering and synthesizing novel videos (and scenes) and highlight the technical contributions being made in areas of Computational Photography and Video.

Using these sets of efforts as a foundation, I will showcase where things are headed in terms of user generated content, media sharing, annotation, and reuse with large scale networks. In essence, everybody is a content producer, distributor, and consumer. I will describe some new efforts that move towards understanding mobile imaging and video, and also discuss issues of collaborative imaging, collective authoring, ad-hoc sensor networks, and peer production with images and videos. Using these concepts I will discuss how all of these issues are impacting the field of Journalism and Reporting and how we have started on a new interdisciplinary research and education effort we call Computational Journalism. The concept of Computational Journalism includes more than just imaging, and relates to media and information in general and is aimed at the study of how we remain informed in this connected world. I will outline this new field and relate it back to imaging, with examples from some of our recent work in this new area.


Paper (2009) ACM CHI: “Videolyzer: Quality Analysis of Online Informational Video for Bloggers and Journalists”

March 4th, 2009 Irfan Essa Posted in ACM UIST/CHI, Computational Journalism, Computational Photography and Video, Nick Diakopoulos | No Comments »

N. Diakopoulos, S. Goldenberg, I. Essa (2009). “Videolyzer: Quality Analysis of Online Informational Video for Bloggers and Journalists.” ACM Conference on Human Factors in Computing Systems (CHI). April, 2009. [PDF] [Project Site] [Video] (CHI 2009 – Digital Life New World – CHI 2009 Advance Program)

Abstract

Screen Shot of Videolyzer

Tools to aid people in making sense of the information quality of online informational video are essential for media consumers seeking to be well informed. Our application, Videolyzer, addresses the information quality problem in video by allowing politically motivated bloggers or journalists to analyze, collect, and share criticisms of the information quality of online political videos. Our interface innovates by providing a fine-grained and tightly coupled interaction paradigm between the timeline, the time-synced transcript, and annotations. We also incorporate automatic textual and video content analysis to suggest areas of interest for further assessment by a person. We present an evaluation of Videolyzer looking at the user experience, usefulness, and behavior around the novel features of the UI as well as report on the collaborative dynamic of the discourse generated with the tool.


Paper (2009) In ACM Symposium on Interactive 3D Graphics “Human Video Textures”

March 1st, 2009 Irfan Essa Posted in ACM SIGGRAPH, Computational Photography and Video, James Rehg, Matt Flagg, Modeling and Animation, Papers, Sing Bing Kang | No Comments »

Matthew Flagg, Atsushi Nakazawa, Qiushuang Zhang, Sing Bing Kang, Young Kee Ryu, Irfan Essa, James M. Rehg (2009), “Human Video Textures”, In Proceedings of the ACM Symposium on Interactive 3D Graphics and Games 2009 (I3D ’09), Boston, MA, February 27-March 1 (Fri-Sun), 2009 [PDF (see Copyright) | Video in DivX | Website]

Abstract

This paper describes a data-driven approach for generating photorealistic animations of human motion. Each animation sequence follows a user-choreographed path and plays continuously by seamlessly transitioning between different segments of the captured data. To produce these animations, we capitalize on the complementary characteristics of motion capture data and video. We customize our capture system to record motion capture data that are synchronized with our video source. Candidate transition points in video clips are identified using a new similarity metric based on 3-D marker trajectories and their 2-D projections into video. Once the transitions have been identified, a video-based motion graph is constructed. We further exploit hybrid motion and video data to ensure that the transitions are seamless when generating animations. Motion capture marker projections serve as control points for segmentation of layers and nonrigid transformation of regions. This allows warping and blending to generate seamless in-between frames for animation. We show a series of choreographed animations of walks and martial arts scenes as validation of our approach.
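The idea of finding candidate transition points can be sketched as follows. This is only an illustration under stated assumptions: it compares 3-D marker positions alone (the paper's metric is based on marker trajectories and also their 2-D projections into video), and `frame_distance`, `candidate_transitions`, `threshold`, and `min_gap` are hypothetical names.

```python
import math
from typing import List, Sequence, Tuple

Marker = Tuple[float, float, float]
Frame = Sequence[Marker]


def frame_distance(a: Frame, b: Frame) -> float:
    """Mean Euclidean distance between corresponding markers of two frames."""
    if len(a) != len(b) or not a:
        raise ValueError("frames must have the same non-zero number of markers")
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)


def candidate_transitions(
    frames: List[Frame], threshold: float, min_gap: int = 2
) -> List[Tuple[int, int]]:
    """Return (i, j) pairs of frames that look similar enough to cut between.

    Pairs closer than min_gap frames apart are skipped, since neighbouring
    frames are trivially similar. Each pair is a candidate motion-graph edge.
    """
    edges: List[Tuple[int, int]] = []
    for i in range(len(frames)):
        for j in range(len(frames)):
            if abs(i - j) >= min_gap and frame_distance(frames[i], frames[j]) <= threshold:
                edges.append((i, j))
    return edges
```

The returned pairs play the role of edges in a motion graph; a path through that graph corresponds to one continuous animation.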


Human Video Textures (Output Rendered as a Collage!)


EVENT: CVPR 2009 Decisions are Announced.

February 25th, 2009 Irfan Essa Posted in Events, PAMI/ICCV/CVPR/ECCV | No Comments »


Paper (2009): ICASSP “Learning Basic Units in American Sign Language using Discriminative Segmental Feature Selection”

February 4th, 2009 Irfan Essa Posted in Face and Gesture, Funding, ICASSP, James Rehg, NSF (0205507), Numerical Machine Learning, Pei Yin, Thad Starner | No Comments »

Pei Yin, Thad Starner, Harley Hamilton, Irfan Essa, James M. Rehg (2009), “Learning Basic Units in American Sign Language using Discriminative Segmental Feature Selection” in IEEE Conference on Acoustics, Speech, and Signal Processing 2009 (ICASSP 2009). Session: Spoken Language Understanding I, Tuesday, April 21, 11:00 – 13:00, Taipei, Taiwan.

ABSTRACT

The natural language for most deaf signers in the United States is American Sign Language (ASL). ASL has internal structure like spoken languages, and ASL linguists have introduced several phonemic models. The study of ASL phonemes is not only interesting to linguists, but also useful for scalability in recognition by machines. Since machine perception is different than human perception, this paper learns the basic units for ASL directly from data. Compared with previous studies, our approach computes a set of data-driven units (fenemes) discriminatively from the results of segmental feature selection. The learning iterates the following two steps: first apply discriminative feature selection segmentally to the signs, and then tie the most similar temporal segments to re-train. Intuitively, the sign parts indistinguishable to machines are merged to form basic units, which we call ASL fenemes. Experiments on publicly available ASL recognition data show that the extracted data-driven fenemes are meaningful, and recognition using those fenemes achieves improved accuracy at reduced model complexity.


Presentation at Duke University (2009): “Computation & Journalism: The Impact of Technology on Journalism, Information Quality, and Civic Literacy”

January 10th, 2009 Irfan Essa Posted in Computational Journalism, Presentations | No Comments »

Talk/Presentation at Duke University, Jan 27, 2009. Hosted by James Hamilton, director of the DeWitt Wallace Center for Media and Democracy at Duke University.

Computation & Journalism: The Impact of Technology on Journalism, Information Quality, and Civic Literacy

Irfan Essa
Georgia Institute of Technology
School of Interactive Computing, GVU and RIM Centers 

Fundamentally, journalism is the process of collecting news information and disseminating that information with a layer of contextualization and understanding provided by journalists in the form of a news story. Recent advances in computational technology are rapidly affecting how news is gathered, reported, and distributed, and how stories are authored and told. New technologies for aggregating, visualizing, summarizing, consuming, and collaborating on news are becoming increasingly popular. These advances are challenging the traditional practices of journalism and directly affecting the future of news production and consumption. Both computation and journalism share a deep interest in information and the value it provides to society, and they are deeply involved in the future of storytelling in various contexts, especially current events. This requires us to consider how both Computation and Journalism can help each other.

In this talk, I will present a vision for a new area of research and education that brings together the fields of computation and journalism to enhance both disciplines and supports the creation of a “Computationalist-Journalist,” a new kind of participant in the public conversation. I will start by describing how imaging, video, and media production and consumption have changed with technology and then how similar technologies can be used for Journalism and related Civic Literacy issues. I will describe new technologies that have changed the landscape of both Computation and Journalism and use these developments to showcase where we are headed with both Computation and Journalism, bringing technologists and journalists together to create new computing tools that further the aims of journalism.

Bio


INTERESTING: “Deep Throat Meets Data Mining”

December 24th, 2008 Irfan Essa Posted in Computational Journalism, In The News, Interesting | No Comments »

“Deep Throat Meets Data Mining”

by JOHN MECKLIN

Dec 23, 2008 in Miller-McCune

“If you pay passing attention to the media landscape, you know that most mainstream news outlets have had their business models undermined by the digital revolution. As their general-interest monopolies have been pillaged by niche online competitors, traditional news organizations have lost revenue and cachet, laying off journalists in waves that have grown into tsunamis. This process has created dire prospects for the future of investigative reporting, often seen as the most costly of journalistic forms.”

The article goes on to mention Computational Journalism, our efforts at GA Tech, Duke University’s recent efforts in this space, and a few others.


Paper: ICPR (2008) “3D Shape Context and Distance Transform for Action Recognition”

December 8th, 2008 Irfan Essa Posted in Activity Recognition, Aware Home, Face and Gesture, Franzi Meier, Matthias Grundmann, PAMI/ICCV/CVPR/ECCV, Papers | 1 Comment »

M. Grundmann, F. Meier, and I. Essa (2008) “3D Shape Context and Distance Transform for Action Recognition”, In Proceedings of International Conference on Pattern Recognition (ICPR) 2008, Tampa, FL. [Project Page | DOI | PDF]

ABSTRACT

We propose the use of 3D (2D+time) Shape Context to recognize the spatial and temporal details inherent in human actions. We represent an action in a video sequence by a 3D point cloud extracted by sampling 2D silhouettes over time. A non-uniform sampling method is introduced that gives preference to fast moving body parts using a Euclidean 3D Distance Transform. Actions are then classified by matching the extracted point clouds. Our proposed approach is based on a global matching and does not require specific training to learn the model. We test the approach thoroughly on two publicly available datasets and compare to several state-of-the-art methods. The achieved classification accuracy is on par with or superior to the best results reported to date.


Disney Research, Pittsburgh

October 23rd, 2008 Irfan Essa Posted in Jessica Hodgins | No Comments »

This academic year, I am spending some time working with the newly formed Disney Research, Pittsburgh (directed by Jessica Hodgins), located next to CMU. The press release announcing this lab is here (Carnegie Mellon SCS Press Release). I am also hanging out with folks at the CMU Robotics Institute and have started some new collaborations. So now, depending on when, you can find me either in Atlanta (at GA Tech) or in Pittsburgh (at the Disney lab or CMU) [OR on an airplane between Pittsburgh and Atlanta].


Paper: ACM Multimedia (2008) “Audio Puzzler: Piecing Together Time-Stamped Speech Transcripts with a Puzzle Game”

October 18th, 2008 Irfan Essa Posted in ACM MM, Computational Journalism, Multimedia, Nick Diakopoulos, Papers | No Comments »

N. Diakopoulos, K. Luther, I. Essa (2008), “Audio Puzzler: Piecing Together Time-Stamped Speech Transcripts with a Puzzle Game.” In Proceedings of ACM International Conference on Multimedia 2008. Vancouver, BC, Canada [Project Link]

ABSTRACT

We have developed an audio-based casual puzzle game which produces a time-stamped transcription of spoken audio as a by-product of play. Our evaluation of the game indicates that it is both fun and challenging. The transcripts generated using the game are more accurate than those produced using a standard automatic transcription system and the time-stamps of words are within several hundred milliseconds of ground truth.


Research: Videolyzer (Online DEMO, try it out!)

October 15th, 2008 Irfan Essa Posted in Collaborators, Computational Journalism, Nick Diakopoulos, Projects | No Comments »

An Online DEMO of Videolyzer, a project by my PhD student, Nick Diakopoulos.

Videolyzer is a tool designed to help journalists and bloggers collect, organize, and present information about the quality (i.e. validity, reliability, etc.) of online videos. It makes it possible to evaluate and make sense of things like comments, claims, and sources as they relate to the video. Users can comment and annotate pieces of the video (called “anchors”) to provide a more fine-grained description of the information in the video. The interface also incorporates a tightly integrated transcript of what’s spoken in the video to make it easier to navigate the dense information there. Finally, Videolyzer allows for collaboration among many people. Users can build off of each other’s annotations and rate each other in a form of distributed vetting and peer-evaluation.
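The anchor-and-annotation idea can be pictured with a small data-structure sketch. It is purely illustrative and does not reflect Videolyzer's actual code or schema: the class names, fields, and the `annotations_at` helper are all assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Anchor:
    """A span of the video, in seconds, that an annotation refers to."""
    start: float
    end: float

    def contains(self, t: float) -> bool:
        return self.start <= t <= self.end


@dataclass
class Annotation:
    """A comment attached to an anchor, with ratings given by other users."""
    author: str
    text: str
    anchor: Anchor
    ratings: List[int] = field(default_factory=list)

    def average_rating(self) -> Optional[float]:
        """Mean peer rating, or None if nobody has rated this annotation yet."""
        return sum(self.ratings) / len(self.ratings) if self.ratings else None


def annotations_at(annotations: List[Annotation], t: float) -> List[Annotation]:
    """All annotations whose anchor covers playback time t."""
    return [a for a in annotations if a.anchor.contains(t)]
```

Looking up annotations by playback time is what would let an interface keep the timeline, the transcript, and the comments in step with one another.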


Paper: ISWC (2008) “Localization and 3D Reconstruction of Urban Scenes Using GPS”

September 28th, 2008 Irfan Essa Posted in ISWC, Kihwan Kim, Mobile Computing, Papers, Thad Starner | No Comments »

Kihwan Kim, Jay Summet, Thad Starner, Daniel Ashbrook, Mrunal Kapade and Irfan Essa  (2008) “Localization and 3D Reconstruction of Urban Scenes Using GPS” In Proceedings of IEEE Symposium on Wearable Computing (ISWC) 2008 (To Appear). [PDF]

ABSTRACT


Using off-the-shelf Global Positioning System (GPS) units, we reconstruct buildings in 3D by exploiting the reduction in signal-to-noise ratio (SNR) that occurs when the buildings obstruct the line-of-sight between the moving units and the orbiting satellites. We measure the size and height of skyscrapers and automatically construct a density map representing the location of multiple buildings in an urban landscape. If deployed on a large scale, via a cellular service provider’s GPS-enabled mobile phones or GPS-tracked delivery vehicles, the system could provide an inexpensive means of continuously creating and updating 3D maps of urban environments.
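One way to picture the density-map idea is a simple voting grid. The sketch below is not the paper's algorithm: it works in a flat 2-D map frame, ignores satellite elevation, and assumes a fixed SNR threshold; the observation layout, parameter names, and default values are all hypothetical.

```python
import math
from collections import Counter
from typing import Iterable, Tuple

# (x in metres, y in metres, azimuth in radians from the +x axis, SNR in dB)
Observation = Tuple[float, float, float, float]


def obstruction_votes(
    observations: Iterable[Observation],
    snr_threshold: float = 30.0,
    cell_size: float = 10.0,
    max_range: float = 100.0,
    step: float = 5.0,
) -> Counter:
    """Vote for grid cells that may contain a building.

    For every low-SNR reading, each cell crossed by the horizontal ray from the
    receiver toward the satellite gets one vote. Cells with many votes from
    different positions are more likely to hold an obstruction.
    """
    votes: Counter = Counter()
    steps = int(max_range // step)
    for x, y, azimuth, snr in observations:
        if snr >= snr_threshold:
            continue  # clear line of sight, nothing to record
        dx, dy = math.cos(azimuth), math.sin(azimuth)
        crossed = set()
        for k in range(1, steps + 1):
            r = k * step
            crossed.add((math.floor((x + r * dx) / cell_size),
                         math.floor((y + r * dy) / cell_size)))
        votes.update(crossed)  # one vote per cell per reading
    return votes
```

Estimating building height, as the paper does, would additionally require the satellite's elevation angle, which this sketch leaves out.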


Paper: Pragmatic Web (2008) “An Annotation Model for Making Sense of Information Quality in Online Videos”

September 28th, 2008 Irfan Essa Posted in Computational Journalism, Multimedia, Nick Diakopoulos, Papers | No Comments »

N. Diakopoulos, I. Essa. (2008) “An Annotation Model for Making Sense of Information Quality in Online Videos.” Proceedings of the International Conference on the Pragmatic Web. 28–30 Sept. 2008, Uppsala, Sweden (To Appear)

ABSTRACT

Making sense of the information quality of online media including things such as the accuracy and validity of claims and the reliability of sources is essential for people to be well-informed. We are developing Videolyzer to address the challenge of information quality sense-making by allowing motivated individuals to analyze, collect, share, and respond to criticisms of the information quality of online political videos and their transcripts. In this paper specifically we present a model of how the annotation ontology and collaborative dynamics embedded in Videolyzer can enhance information quality.


Presentation: At Qualcomm Research in San Diego, CA (2008) “From Computational Photography and Video to Computational Journalism”

September 24th, 2008 Irfan Essa Posted in Computational Journalism, Computational Photography and Video, Presentations | No Comments »

From Computational Photography and Video to Computational Journalism

 
Abstract

Digital image capture, processing, and sharing have become pervasive in our society. This has had significant impact on how we create novel scenes, how we share our experiences, and how we interact with images and videos. In this talk, I will present an overview of a series of ongoing efforts in the analysis of images and videos for rendering novel scenes. First I will discuss (in brief) our work on Video Textures, where repeating information is extracted to generate extended sequences of videos. I will also describe some of our extensions to this approach that allow for controlled generation of animations of video sprites. We have developed various learning and optimization techniques that allow for video-based animations of photo-realistic characters. Using these sets of approaches as a foundation, I will then show how new images and videos can be generated. I will show examples of Photorealistic and Non-photorealistic Renderings of Scenes (Videos and Images) and how these methods support the media reuse culture, so common these days with user generated content. I will then describe some of our new efforts that move towards understanding mobile imaging and video, and also discuss issues of collaborative imaging and authoring, ad-hoc sensor networks, and peer production with images and videos, leading to new concepts of how computation has impacted journalism. Time permitting, I will also share some of our efforts on video annotation and how we have taken some of these new concepts of video analysis to classrooms.


Funding (2008): NSF “Web on Demand – Bridging the Gap Between Social Networks and Ad Hoc Networking”

September 1st, 2008 Irfan Essa Posted in Computational Journalism, Kishore Ramachandran, Mobile Computing | No Comments »

Award#0834545 – CSR-DMSS, SM: Web on Demand – Bridging the Gap Between Social Networks and Ad Hoc Networking

Investigator(s): Umakishore Ramachandran, (Principal Investigator), Irfan Essa (Co-Principal Investigator)

Dates: September 1, 2008 – August 31, 2009 (Estimated)

Abstract

From the western world to the third world, the use of handheld devices (cellphones, PDAs) has proliferated. The world of users is becoming both wireless and mobile. Web 2.0 has ushered in an age wherein the web is viewed as a provider of services and not just a repository of documents and/or information. Despite this advance, the web remains just that, a single web with an inherent assumption that a powerful computing and communication infrastructure supports it. Couldn’t mobile wireless devices in close proximity form a web of their own? This is the vision behind this project, the Web on Demand (WoD). WoD aims at bridging the gap between social networks and ad hoc networking. In other words, it aims to rethink the system software stack all the way from application to networking that would allow the creation and management of social networks without any assumption of infrastructure support. The core of the research is to develop software technologies for mobile devices that would allow the dynamic creation of thematic ad hoc overlay networks empowering (a) mobile people with similar interests (e.g., weather forecast), (b) friends and family (e.g., in a theme park), and (c) participants in mission critical applications (e.g., search and rescue) to stay connected. WoD complements the World Wide Web (WWW) and leverages it when it is available, such as exploiting the ambient computing infrastructure to enhance user experience, and managing the dynamic creation of User Generated Content (UGC) by mobile users. The vision behind this project is to democratize access to services that are currently offered through WWW. In this sense, the results from this research can have far-reaching technological and societal consequences. Most importantly, the research will help breed a new class of computer scientists who are connected with societal causes in addition to advancing technology.


Teaching: CS 4480 DVFX, Fall 08 “viral edition”

August 19th, 2008 Irfan Essa Posted in DVFX, Frank Dellaert | No Comments »

I am very pleased that my colleague (and friend) Professor Frank Dellaert has taken over my DVFX class that I have been teaching since 1999 (see site here). It is clear already that this new edition of the DVFX class will be even more exciting than the previous editions. Can’t wait to see the final videos. Check out the info on the class at CS 4480 DVFX, Fall 08.


Event: ACM SIGGRAPH 2008 Class on Computation and Journalism

August 12th, 2008 Irfan Essa Posted in Computational Journalism, Events, SIGGRAPH/SCA/NPAR/EG | 3 Comments »

ACM SIGGRAPH 2008 Class on Computation and Journalism

  • Date and Time: Wednesday, 13 August 2008 | 1:45 pm – 5:30 pm
  • Location: Room 502 A, Los Angeles Convention Center, Los Angeles, CA, USA

Fundamentally, journalism is the process of collecting news information and disseminating that information with a layer of contextualization and understanding provided by journalists in the form of a news story. Recent advances in computational technology are rapidly affecting how news is gathered, reported, and distributed, and how stories are authored and told. New technologies for aggregating, visualizing, summarizing, consuming, and collaborating on news are becoming increasingly popular. They are challenging the traditional practices of journalism and directly affecting the future of news production and consumption. Computation and journalism share a deep interest in information and the value it provides to society, and they are deeply involved in the future of storytelling in various contexts, especially current events. This class summarizes how these new technologies affect journalism, both at the core of the journalism discipline and in its practice and business. Topics include: the technologies that have empowered citizen journalism and related citizen media production and authoring; mobile and sensing technologies that allow journalism to become ubiquitous and pervasive; the changes in photo, video, and broadcast journalism; and how web, online, and science journalism are changing the basic processes of reporting. Instructors focus especially on areas of special interest to the SIGGRAPH community: photography and video, large-scale information visualization, and social networking.

Presentations will be made by:

This course is open to all registrants of ACM SIGGRAPH 2008 and has no prerequisites. See the info on the ACM SIGGRAPH site.


Research: Audio Puzzler Alpha

August 7th, 2008 Irfan Essa Posted in Computational Journalism, Nick Diakopoulos | No Comments »

Audio Puzzler Alpha (ONLINE DEMO)

By Nick Diakopoulos (My PhD Student)

Audio Puzzler is a new kind of puzzle game based on unauthored content found online. The audio for the puzzles is taken from popular or interesting video clips from different genres such as news, documentary, or television. The audio puzzler is the type of game that harnesses people’s play to also provide valuable data which enriches the content played with. This is in the same vein as the ESPGame, the Listen Game, and PhotoPlay, which are all games which gather data in the process of game play. But while the data collected by these other games is useful for machine learning, the data collected with audio puzzler is immediately valuable as a transcription of the speech in the video. A similar effort (but in a much grander domain) is the Fold It project which seeks to harness playtime to solve protein folding problems. Much more detailed information about the evaluation of the technology will be forthcoming in a paper to be published at ACM Multimedia in October.


Interesting: The Changing Newsroom | Project for Excellence in Journalism PEJ

August 7th, 2008 Irfan Essa Posted in Computational Journalism, Interesting | No Comments »

The Changing Newsroom | Project for Excellence in Journalism PEJ

An analysis of the changing world of journalism. Worth a read. It describes how newsrooms, and print media in particular, are being affected.


Thesis Raffay Hamid PhD (2008): “A Computational Framework For Unsupervised Analysis of Everyday Human Activities”

June 18th, 2008 Irfan Essa Posted in Aaron Bobick, Activity Recognition, Numerical Machine Learning, PhD, Raffay Hamid | No Comments »

M. Raffay Hamid PhD (2008), “A Computational Framework For Unsupervised Analysis of Everyday Human Activities”, PhD Thesis, Georgia Institute of Technology, College of Computing, Atlanta, GA. (Advisors: Aaron Bobick & Irfan Essa)

Abstract

In order to make computers proactive and assistive, we must enable them to perceive, learn, and predict what is happening in their surroundings. This presents us with the challenge of formalizing computational models of everyday human activities. For a majority of environments, the structure of the in situ activities is generally not known a priori. This thesis therefore investigates knowledge representations and manipulation techniques that can facilitate learning of such everyday human activities in a minimally supervised manner. 

A key step towards this end is finding appropriate representations for human activities. We posit that if we choose to describe activities as finite sequences of an appropriate set of events, then the global structure of these activities can be uniquely encoded using their local event sub-sequences. With this perspective at hand, we particularly investigate representations that characterize activities in terms of their fixed and variable length event subsequences. We comparatively analyze these representations in terms of their representational scope, feature cardinality and noise sensitivity.
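As a rough illustration of the fixed-length case, the sketch below counts every contiguous length-n event subsequence (an n-gram) in an activity and compares two activities by the overlap of their n-gram histograms. The event names, the function names, and the overlap measure are all assumptions made for this example; the thesis itself should be consulted for the representations and similarity measures it actually uses.

```python
from collections import Counter


def event_ngrams(events, n):
    """Count every contiguous length-n subsequence of an event sequence."""
    return Counter(tuple(events[i:i + n]) for i in range(len(events) - n + 1))


def ngram_similarity(a, b, n=3):
    """Histogram overlap of two activities' n-grams, in [0, 1] (illustrative measure)."""
    ha, hb = event_ngrams(a, n), event_ngrams(b, n)
    shared = sum((ha & hb).values())  # multiset intersection
    total = max(sum(ha.values()), sum(hb.values()))
    return shared / total if total else 0.0


# Hypothetical kitchen activities written as event sequences.
make_tea = ["enter", "fridge", "stove", "sink", "stove", "exit"]
make_eggs = ["enter", "fridge", "stove", "sink", "table", "exit"]
similarity = ngram_similarity(make_tea, make_eggs, n=3)
```

Each activity here yields four trigrams, two of which are shared, so the two sequences look alike locally even though they diverge near the end.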

Exploiting such representations, we propose a computational framework to discover the various activity-classes taking place in an environment. We model these activity-classes as maximally similar activity-cliques in a completely connected graph of activities, and describe how to discover them efficiently. Moreover, we propose methods for finding concise characterizations of these discovered activity-classes, both from a holistic as well as a by-parts perspective. Using such characterizations, we present an incremental method to classify a new activity instance to one of the discovered activity-classes, and to automatically detect if it is anomalous with respect to the general characteristics of its membership class. Our results show the efficacy of our framework in a variety of everyday environments.


Thesis David Minnen PhD (2008): “Unsupervised Discovery of Activity Primitives from Multivariate Sensor Data”

June 18th, 2008 Irfan Essa Posted in Activity Recognition, David Minnen, PhD, Thad Starner | No Comments »

David Minnen PhD (2008): “Unsupervised Discovery of Activity Primitives from Multivariate Sensor Data”, PhD Thesis, Georgia Institute of Technology, College of Computing, Atlanta, GA. (Advisors: Thad Starner & Irfan Essa)

Abstract

This research addresses the problem of temporal pattern discovery in real-valued, multivariate sensor data. Several algorithms were developed, and subsequent evaluation demonstrates that they can efficiently and accurately discover unknown recurring patterns in time series data taken from many different domains. Different data representations and motif models were investigated in order to design an algorithm with an improved balance between run-time and detection accuracy. The different data representations are used to quickly filter large data sets in order to detect potential patterns that form the basis of a more detailed analysis. The representations include global discretization, which can be efficiently analyzed using a suffix tree, local discretization with a corresponding random projection algorithm for locating similar pairs of subsequences, and a density-based detection method that operates on the original, real-valued data. In addition, a new variation of the multivariate motif discovery problem is proposed in which each pattern may span only a subset of the input features. An algorithm that can efficiently discover such “subdimensional” patterns was developed and evaluated. The discovery algorithms are evaluated by measuring the detection accuracy of discovered patterns relative to a set of expected patterns for each data set. The data sets used for evaluation are drawn from a variety of domains including speech, on-body inertial sensors, music, American Sign Language video, and GPS tracks.
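To make the global-discretization idea concrete, here is a minimal sketch for a single univariate series: values are mapped to symbols with equal-width bins, and repeated symbol words are then located with a plain dictionary index. Both choices are assumptions for illustration only. The abstract mentions a suffix tree for the analysis step, and it does not specify the binning scheme; the function names are hypothetical.

```python
from collections import defaultdict


def discretize(series, n_bins=4):
    """Map each real value to a symbol 'a', 'b', ... using equal-width bins."""
    lo, hi = min(series), max(series)
    width = (hi - lo) / n_bins or 1.0  # avoid division by zero on flat series
    return "".join(
        chr(ord("a") + min(int((x - lo) / width), n_bins - 1)) for x in series
    )


def repeated_words(symbols, length):
    """Return {word: [start indices]} for words that occur at least twice without overlap."""
    positions = defaultdict(list)
    for i in range(len(symbols) - length + 1):
        word = symbols[i:i + length]
        # Skip occurrences that overlap the previously recorded one.
        if not positions[word] or i - positions[word][-1] >= length:
            positions[word].append(i)
    return {w: p for w, p in positions.items() if len(p) >= 2}


series = [0.2, 1.5, 2.9, 0.1, 0.3, 1.4, 3.0, 0.0]
symbols = discretize(series, n_bins=3)
candidates = repeated_words(symbols, length=3)
```

The candidate words only mark where the real-valued data deserves a closer look; a full method would still compare the underlying subsequences in the original signal.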


Interesting: The State of the News Media 2008

April 16th, 2008 Irfan Essa Posted in Computational Journalism, Interesting | No Comments »

The Project for Excellence in Journalism has produced an excellent report, “The State of the News Media 2008”, on American journalism. It is a very interesting read, and particularly relevant to our efforts on Computation and Journalism.


Paper: ICASSP (2008) “Discriminative Feature Selection for Hidden Markov Models using Segmental Boosting”

April 3rd, 2008 Irfan Essa Posted in Face and Gesture, Funding, James Rehg, NSF (0205507), Numerical Machine Learning, PAMI/ICCV/CVPR/ECCV, Papers, Pei Yin, Thad Starner | No Comments »

Pei Yin, Irfan Essa, James Rehg, Thad Starner (2008) “Discriminative Feature Selection for Hidden Markov Models using Segmental Boosting”, ICASSP 2008 – March 30 – April 4, 2008 – Las Vegas, Nevada, U.S.A. (Paper: MLSP-P3.D8, Session: Pattern Recognition and Classification II, Time: Thursday, April 3, 15:30 – 17:30, Topic: Machine Learning for Signal Processing: Learning Theory and Modeling) (PDF|Project Site)

ABSTRACT

We address the feature selection problem for hidden Markov models (HMMs) in sequence classification. Temporal correlation in sequences often causes difficulty in applying feature selection techniques. Inspired by segmental k-means segmentation (SKS), we propose Segmentally Boosted HMMs (SBHMMs), where the state-optimized features are constructed in a segmental and discriminative manner. The contributions are twofold. First, we introduce a novel feature selection algorithm, where the temporal dynamics are decoupled from the static learning procedure by assuming that the sequential data are piecewise independent and identically distributed. Second, we show that the SBHMM consistently improves traditional HMM recognition in various domains. The reduction of error compared to traditional HMMs ranges from 17% to 70% in American Sign Language recognition, human gait identification, lip reading, and speech recognition.


Event: SIGGRAPH PC Meeting at GA Tech

March 30th, 2008 Irfan Essa Posted in Events, Greg Turk, SIGGRAPH/SCA/NPAR/EG | No Comments »

The ACM SIGGRAPH 2008 Papers Committee Meeting was held at GA Tech in Atlanta, March 29-30, under the leadership of Greg Turk. Following is a picture of all of us at work, with our sigs, as a note of thanks for Greg.

20080330-at-17h07m27-mg-9450bw.jpg

Original Photo by myself, this version with sigs by Fredo Durand.


Funding: NSF (2008) “Symposium on Computation and Journalism”

March 8th, 2008 Irfan Essa Posted in Computational Journalism, Funding | No Comments »

Award#0813831 – Symposium on Computation and Journalism

ABSTRACT

Fundamentally, journalism is aimed at collecting news information and disseminating that information with a layer of contextualization and understanding provided by journalists. Recent advances in computational technology are rapidly affecting how news information is gathered, reported and distributed. Furthermore, new avenues for aggregating, visualizing, summarizing, consuming, and collaborating on news are increasingly becoming popular and challenging traditional practices of Journalism. Following the success of text search, image and video search questions are now poised to make a bigger impact on journalism and other related fields. Computation and Journalism share a deep-rooted interest in Information, and the value it provides to society. The concept of Information Quality, the measure of the value that the information provides to the user of that information, brings these two disciplines together. In computing and information sciences, information quality is used to describe the degree of excellence in communicating knowledge or intelligence and is composed of different facets such as accuracy, reliability, comprehensiveness, currency, and validity. In journalism, where the conveyance of quality information is paramount, principles such as accuracy, fairness, thoroughness, and transparency guide journalists in communicating quality information. Traditionally, journalism has also entailed an ethos of working on the side of the citizenry to provide them with quality information they need to make informed decisions in the process of their daily lives. However, the plethora of un-vetted blogs, podcasts, videos and other online media, generated by users or by corporations with subjective biases, has led to significant compromise in information quality. Collaborative knowledge generation (Wikipedia) and citizen journalism are showing new ways of how information and (global) news can be shared.
However, as the Web and the Internet continue to grow and as computing technologies pervade the planet, a thorough study of the process of journalism and the deep computational aspects of such processes needs to be undertaken. To this end, the PI’s research group at Georgia Institute of Technology is interested in understanding how computational advances impact the field of journalism. The long term aim is to make novel contributions by developing computational technologies to better support the goals of journalism. To launch this effort, they are organizing a Symposium on Computation + Journalism at GA Tech, in Atlanta, GA, February 22-23, 2008. The goal of this symposium is to bring together stakeholders from all aspects of Journalism, Media, and Computation. Participants in panels, presentations and breakout groups will discuss these issues and create a roadmap towards answering these questions that bring together computation and journalism.


Personal: Creative Use of Computational Photography and Journalism

March 3rd, 2008 Irfan Essa Posted in Personal | 1 Comment »

Irfan’s Office Hacked

My students decided to play a very nice joke on me. This morning I walked in to find my office open (and it was not!)

Office Open Office Really Open

Check out the left image, with the door closed, and the right image, with the door open.

And, then the inside of the office was kinda different too.

Inside of the Office


Event: Journalism 3G The Future of Technology in the Field

February 23rd, 2008 Irfan Essa Posted in Computational Journalism, Events, Nick Diakopoulos | No Comments »

Journalism 3G: The Future of Technology in the Field (A Symposium on Computation and Journalism) was a huge success.

  • We had over 230 registered attendees. Thanks to all participants, panelists, and speakers.
  • Use our Social Network (http://cj.crowdvine.com/) to continue the conversation.
  • Join the FACEBOOK group (http://git.facebook.com/group.php?gid=18427444784)
  • Use the tag “CnJ” on all blog posts and photo/video posts on the web, so we can collect them
  • Videos of the event are now available here.



Event: Symposium on computation+journalism (Feb 22-23, 2008, Atlanta, GA)

February 15th, 2008 Irfan Essa Posted in Events, Nick Diakopoulos | No Comments »

Working with Brad Stenger (Wired), Nick Diakopoulos (GA Tech), and Sergio Goldenberg (GA Tech), we are organizing a Symposium on computation+journalism, to bring together computationalists, internet/media experts, and journalists for a series of panels, presentations, and discussions around how computing technologies are affecting (and changing) journalism practices. We have over 180 people registered and it promises to be a great first-of-its-kind event. This event is being hosted by the GVU Center at Georgia Tech.


Event: AAAI 2008 Special Track on Physically-Grounded AI

February 14th, 2008 Irfan Essa Posted in Events, Service | No Comments »

I am co-chairing, with Drew Bagnell (CMU) and Wolfram Burgard (University of Freiburg), a Special Track on Physically-Grounded AI. See AAAI-08: Twenty-Third Conference on Artificial Intelligence, Chicago, IL, USA. The goal of this special track is to bring researchers from computer vision, robotics, machine learning and activity recognition to AAAI in a unified forum. All papers in this track will be full AAAI papers.

We received around 60 submissions to this track and expect a few NECTAR (new scientific and technical advances in research) submissions too (due Feb 18, 2008). The primary track submissions are in the process of review.

Abstract Submission Deadline: January 25, 2008 *DONE*
Paper Submission Deadline: January 30, 2008 *DONE*
Author Notification Deadline: April 1, 2008


Spring 2008 Term

January 4th, 2008 Irfan Essa Posted in Personal, Research, Teaching | No Comments »

Spring 2008 Term at GA Tech begins Monday 1/7/2008. It will be a busy term with the following activities, in addition to my research related activities.


Personal: Read this Book “Three Cups of Tea” By Greg Mortenson and David Oliver Relin

December 4th, 2007 Irfan Essa Posted in Personal | No Comments »

“Three Cups of Tea” By Greg Mortenson and David Oliver Relin
This is a great book. I have just started reading it, but many have recommended it.

Also see


Paper: MICCAI (2007) “A Boosted Segmentation Method for Surgical Workflow Analysis”

November 1st, 2007 Irfan Essa Posted in Activity Recognition, Health Systems, Papers, Research | No Comments »

N. Padoy, T. Blum, I. Essa, H. Feußner, M.O. Berger, N. Navab, “A Boosted Segmentation Method for Surgical Workflow Analysis”, Proceedings of Medical Image Computing and Computer-Assisted Intervention (MICCAI 2007) (to appear), Brisbane, Australia, Oct. 29 – Nov. 2, 2007 (bib)

Abstract

As demands on hospital efficiency increase, there is a stronger need for automatic analysis, recovery, and modification of surgical workflows. Even though most of the previous work has dealt with higher level and hospital-wide workflow including issues like document management, workflow is also an important issue within the surgery room. Its study has a high potential, e.g., for building context-sensitive operating rooms, evaluating and training surgical staff, optimizing surgeries and generating automatic reports. In this paper we propose an approach to segment the surgical workflow into phases based on temporal synchronization of multidimensional state vectors. Our method is evaluated on the example of laparoscopic cholecystectomy with state vectors representing tool usage during the surgeries. The discriminative power of each instrument in regard to each phase is estimated using AdaBoost. A boosted version of the Dynamic Time Warping (DTW) algorithm is used to create a surgical reference model and to segment a newly observed surgery. Full cross-validation on ten surgeries is performed and the method is compared to standard DTW and to Hidden Markov Models.
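For readers unfamiliar with Dynamic Time Warping, the sketch below shows the standard algorithm applied to sequences of binary tool-usage vectors. It is not the boosted variant from the paper. The optional per-instrument `weights` argument is only a guess at how learned instrument relevance might enter the distance, and the toy sequences are invented.

```python
def dtw(seq_a, seq_b, weights=None):
    """Align two sequences of equal-length vectors; return (total cost, warping path)."""
    dim = len(seq_a[0])
    w = weights or [1.0] * dim  # hypothetical per-instrument weights

    def dist(u, v):
        # Weighted L1 distance between two state vectors.
        return sum(wi * abs(ui - vi) for wi, ui, vi in zip(w, u, v))

    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
            cost[i][j] = dist(seq_a[i - 1], seq_b[j - 1]) + step

    # Backtrack from the end to recover which frames were matched.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    return cost[n][m], path[::-1]


# Toy example: each vector says which of two instruments is in use.
reference = [(1, 0), (1, 0), (0, 1), (0, 1)]
observed = [(1, 0), (0, 1)]
total_cost, path = dtw(reference, observed)
```

If the reference frames carry phase labels, the warping path transfers those labels to the observed surgery, which is how an alignment can double as a segmentation.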


Presentation: CETEE (2007): “Computational Photography & Video: Research & Education”

October 30th, 2007 Irfan Essa Posted in Presentations, Research, Teaching | No Comments »

I was invited to participate and present at the CETEE 2007, Islamabad, November 27-28, 2007.

This meeting has recently been postponed.


Paper: Ergonomics in Design (2007), “Designing a Technology Coach”

October 29th, 2007 Irfan Essa Posted in A. Dan Fisk, Activity Recognition, Aware Home, Papers, Wendy Rogers | No Comments »

FEATURE AT A GLANCE: Technology in the home environment has the potential to support older adults in a variety of ways. We took an interdisciplinary approach (human factors/ergonomics and computer science) to develop a technology “coach” that could support older adults in learning to use a medical device. Our system provided a computer vision system to track the use of a blood glucose meter and provide users with feedback if they made an error. This research could support the development of an in-home personal assistant to coach individuals in a variety of tasks necessary for independent living.

KEYWORDS: home technology, medical devices, support for learning


Paper: IEEE Data Mining Conference 2007 “Detecting Subdimensional Motifs: An Efficient Algorithm for Generalized Multivariate Pattern Discovery”

October 28th, 2007 Irfan Essa Posted in Activity Recognition, Charles Isbell, David Minnen, Papers, Research, Thad Starner | No Comments »

D. Minnen, I. Essa, C.L. Isbell, and T. Starner “Detecting Subdimensional Motifs: An Efficient Algorithm for Generalized Multivariate Pattern Discovery” In IEEE Int. Conf. on Data Mining (ICDM) 2007, Omaha, NE, October 28-31, 2007. [PDF]

Abstract

Discovering recurring patterns in time series data is a fundamental problem for temporal data mining. This paper addresses the problem of locating subdimensional motifs in real-valued, multivariate time series, which requires the simultaneous discovery of sets of recurring patterns along with the corresponding relevant dimensions. While many approaches to motif discovery have been developed, most are restricted to categorical data, univariate time series, or multivariate data in which the temporal patterns span all of the dimensions. In this paper, we present an expected linear-time algorithm that addresses a generalization of multivariate pattern discovery in which each motif may span only a subset of the dimensions. To validate our algorithm, we discuss its theoretical properties and empirically evaluate it using several data sets including synthetic data and motion capture data collected by an on-body inertial sensor.


Poster: ACM UIST (2007) “NARC: The News Article Revision Comparator.”

October 26th, 2007 Irfan Essa Posted in ACM UIST/CHI, Computational Journalism, Nick Diakopoulos | No Comments »

A. St. Clair, M. Fong, N. Diakopoulos, I. Essa. (2007) “NARC: The News Article Revision Comparator.” In Proceedings addendum of User Interface Software Technology (UIST). Newport, Rhode Island, October 2007 [Abstract] [Poster]

ABSTRACT

Currency of information in news consumption is an important facet of information quality which involves both the journalist providing updated information and the consumer being aware of updates and changes to the news stream. We are addressing information quality and currency in online news articles from the viewpoint of news consumption with the intent of reducing the consumption effort involved in getting the most up-to-date information on a breaking news story. The goal of this research is thus to develop a web-based user interface which (1) allows users to easily and quickly see updates to news articles online and (2) blends into existing consumption patterns by integrating into news websites. We have built NARC to address these issues by providing an integrated interface which allows users to quickly perceive changes to news articles using an inline text visualization.
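One simple way to produce an inline view of changes between two revisions is a word-level diff. The sketch below uses Python's standard `difflib.SequenceMatcher` and marks removals as `[-...-]` and additions as `{+...+}`. The markup, the function name, and the example headlines are assumptions for illustration; this is not a description of how NARC itself computes or renders changes.

```python
import difflib


def inline_diff(old, new):
    """Mark word-level changes between two revisions of a text."""
    a, b = old.split(), new.split()
    out = []
    matcher = difflib.SequenceMatcher(a=a, b=b)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            out.append("[-" + " ".join(a[i1:i2]) + "-]")  # text dropped from old revision
        if tag in ("insert", "replace"):
            out.append("{+" + " ".join(b[j1:j2]) + "+}")  # text new in this revision
        if tag == "equal":
            out.append(" ".join(a[i1:i2]))
    return " ".join(out)


# Invented example revisions of a breaking-news sentence.
old_revision = "Two people were injured"
new_revision = "Three people were injured today"
marked = inline_diff(old_revision, new_revision)
```

In a web interface the same opcodes could drive styled spans instead of bracket markers, keeping the reader inside the article text.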


Awarded the “GVU 15 years of Impact Award”

October 25th, 2007 Irfan Essa Posted in Events, In The News, Research | No Comments »

[Photos: Jim Foley and Irfan Essa; the award]

Awarded the “GVU 15 years of Impact Award” at GVU 15 Anniversary Celebration and Symposium on October 25, 2007.


Event: GVU 15 Anniversary Celebration Symposium (2007)

October 25th, 2007 Irfan Essa Posted in Events, Research | No Comments »

GVU 15 Anniversary Celebration and Symposium

GVU will celebrate “15 Years of Impact” at this anniversary symposium on October 25, 2007.

This special day in GVU history will include:

  • Morning keynote addresses by Andy van Dam and Genevieve Bell.
  • Presentation of 15 GVU Impact Awards for the people, projects and programs that have had significant impact on GVU research activities.
  • An afternoon keynote by GVU Director, Elizabeth Mynatt, outlining GVU’s mission and research program.
  • A demo reception where you will witness the latest innovations at the GVU Center.
  • An outdoor bbq in honor of returning GVU alums and faculty.

Paper: ICCV 2007, “Structure from Statistics – Unsupervised Activity Analysis using Suffix Trees”

October 15th, 2007 Irfan Essa Posted in Aaron Bobick, Activity Recognition, Aware Home, PAMI/ICCV/CVPR/ECCV, Papers, Raffay Hamid | No Comments »

Abstract

Models of activity structure for unconstrained environments are generally not available a priori. Recent representational approaches to this end are limited by their computational complexity, and ability to capture activity structure only up to some fixed temporal scale. In this work, we propose Suffix Trees as an activity representation to efficiently extract structure of activities by analyzing their constituent event-subsequences over multiple temporal scales. We empirically compare Suffix Trees with some of the previous approaches in terms of feature cardinality, discriminative prowess, noise sensitivity and activity-class discovery. Finally, exploiting properties of Suffix Trees, we present a novel perspective on anomalous subsequences of activities, and propose an algorithm to detect them in linear-time. We present comparative results over experimental data, collected from a kitchen environment to demonstrate the competence of our proposed framework.


Thesis: Mitch Parry PhD (2007), “Separation and Analysis of Multichannel Signals”

October 9th, 2007 Irfan Essa Posted in Audio Analysis, Funding, Mitch Parry, NSF (0205507), PhD, Thesis | No Comments »

Mitch Parry (2007), “Separation and Analysis of Multichannel Signals”, PhD Thesis [PDF], Georgia Institute of Technology, College of Computing, Atlanta, GA. (Advisor: Irfan Essa)

Abstract

This thesis examines a large and growing class of digital signals that capture the combined effect of multiple underlying factors. In order to better understand these signals, we would like to separate and analyze the underlying factors independently. Although source separation applies to a wide variety of signals, this thesis focuses on separating individual instruments from a musical recording. In particular, we propose novel algorithms for separating instrument recordings given only their mixture. When the number of source signals does not exceed the number of mixture signals, we focus on a subclass of source separation algorithms based on joint diagonalization. Each approach leverages a different form of source structure. We introduce repetitive structure as an alternative that leverages unique repetition patterns in music and compare its performance against the other techniques.

When the number of source signals exceeds the number of mixtures (i.e., the underdetermined problem), we focus on spectrogram factorization techniques for source separation. We extend single-channel techniques to utilize the additional spatial information in multichannel recordings, and use phase information to improve the estimation of the underlying components.


Presentation: U of Maryland: “Computational Photography and Video: Spatio Temporal Analysis for Synthesis”

September 25th, 2007 Irfan Essa Posted in Computational Photography and Video, Presentations | No Comments »

Computational Photography and Video: Spatio Temporal Analysis for Synthesis of Novel Images and Videos.

ABSTRACT

Digital image capture and processing has recently had a significant impact on the computer graphics quest for rendering novel scenes. In this talk, I will present an overview of a series of ongoing efforts in the analysis of images and videos for rendering novel scenes. First I will discuss (in brief) our work on Video Textures, where repeating information is extracted to generate extended sequences of videos. I will then describe some of our extensions to this approach that allow for controlled generation of animations of video sprites. We have developed various learning and optimization techniques that allow for video-based animations of photo-realistic characters. Then I will describe additional approaches for image and video synthesis that build on optimal patch-based copying of samples. I will show how our methods allow for iterative refinement, with a variety of optimization criteria, and allow for extension to synthesis of both images and video from very limited samples. Using these sets of approaches as a foundation, I will then show how new images and videos can be generated. I will show examples of Photorealistic and Non-photorealistic Renderings of Scenes (Videos and Images) and how these methods support the media reuse culture, so common these days with user generated content. Time permitting, I will also share some of our efforts on video annotation and how we have taken some of these new concepts of video analysis to undergraduate classrooms.
