Computer Scientists Developing Technology To Improve Data Mining For Homeland Security

From online news articles to blogs, a massive amount of information is voluntarily being put before the public every day.

Some of this information may be valuable for protecting homeland security. However, to sift through this readily available content and summarize it for agencies like the Department of Homeland Security, analysts need to do more than sit at a computer entering words like "al-Qaida" into Internet search engines.

That's why Kansas State University's William Hsu and other computer scientists who research data mining are part of a project to develop technology that makes automated Internet searches more useful and productive.

"We're helping to develop the next generation of Web search and crawling," Hsu said. "Our goal is to develop a research program that will help with homeland security. The Department of Homeland Security wants to pull information that's available to anyone in the public domain, like millions of articles from sources like CNN and Al-Jazeera, and monitor them for security."

Hsu is an associate professor of computer and information sciences, head of K-State's Laboratory for Knowledge Discovery in Databases, and co-principal investigator of a Department of Homeland Security-funded summer institute aimed at training future researchers in data sciences. The $2.4 million Data Sciences Summer Institute, headed by the University of Illinois along with K-State and the University of Texas at San Antonio, is titled "Multimodal Information Access and Synthesis." The Illinois-led cooperative is one of four such University Affiliate Centers nationwide.

Data mining is a way of processing vast amounts of information and putting it in multiple, useful formats. Hsu's data mining research at K-State includes applications in fields like genome analysis, nanoscale materials modeling and diagnostic medicine. The work at K-State that will benefit homeland security strives to resolve ambiguity in Internet searches. For instance, this would allow a search engine to differentiate between homeland security as a concept and Homeland Security as a government agency. Hsu said that one of the institute's projects aims to improve name recognition, a heavily studied problem in information extraction.

"The goal is to develop an automated system that can pick out al-Qaida as an organization, Kandahar as a place and Osama bin Laden as a person, based upon rules developed from previously seen documents," Hsu said. "Subcategories are a problem," he said. "'People' is a big tag. Is this a head of state? A celebrity? Someone who was interviewed?"
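As a rough illustration of what "rules developed from previously seen documents" could look like, the Python sketch below learns the most common tag for each name from a handful of hand-labeled mentions and then tags new text by longest match. It is not the institute's system. The function names, the tag set and the sample data are all assumptions made for this example.

```python
from collections import Counter, defaultdict

def learn_rules(labeled_mentions):
    """Map each name to the tag it most often carried in previously seen documents."""
    counts = defaultdict(Counter)
    for name, tag in labeled_mentions:
        counts[name.lower()][tag] += 1
    return {name: tags.most_common(1)[0][0] for name, tags in counts.items()}

def tag_entities(text, rules):
    """Return (mention, tag) pairs, preferring the longest known name at each position."""
    words = text.split()
    max_len = max((len(name.split()) for name in rules), default=0)
    found, i = [], 0
    while i < len(words):
        for n in range(min(max_len, len(words) - i), 0, -1):
            # Trim punctuation from the ends of the candidate phrase only.
            candidate = " ".join(words[i:i + n]).strip(".,;:!?\"'")
            tag = rules.get(candidate.lower())
            if tag:
                found.append((candidate, tag))
                i += n
                break
        else:
            i += 1  # no known name starts here
    return found

# Hypothetical labels standing in for annotations on earlier documents.
seen = [
    ("Kandahar", "PLACE"),
    ("Osama bin Laden", "PERSON"),
    ("Department of Homeland Security", "ORGANIZATION"),
    ("Kandahar", "PLACE"),
]
rules = learn_rules(seen)
mentions = tag_entities(
    "Osama bin Laden was reported near Kandahar, "
    "the Department of Homeland Security said.",
    rules,
)
```

A lookup like this only recognizes names it has already seen and matches without regard to capitalization, so it cannot separate "homeland security" the concept from the agency. It also says nothing about the subcategory problem Hsu describes.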

Data mining research at K-State and collaborating institutions is helping solve another problem with getting information off the Internet: inefficient crawling. Search engines provide up-to-date results by first looking through vast numbers of Web pages and archiving them, a process called crawling, Hsu said. The project leader, Kevin Chang at the University of Illinois, describes the problem with this process as "crawling in the dark -- you start somewhere and grab everything." Hsu said research in this area will lead to better searches in which search engines can, for instance, anticipate keywords. Search engines also could create virtual neighborhoods of information, in which connections are made among bits of information based on the results of similar searches.
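The difference between "grabbing everything" and a more directed crawl can be sketched on a toy, in-memory Web. In the Python example below, the blind crawler visits pages in the order it discovers them, while the focused crawler follows the links whose anchor text mentions the most topic keywords first. The page names, link text, keywords and page budget are invented for illustration. Ranking links by anchor text is one common heuristic, not necessarily the approach the project uses.

```python
import heapq
from collections import deque

# Toy Web: each page maps to a list of (anchor text, target page) links.
WEB = {
    "seed": [("sports scores", "sports"), ("security news", "news")],
    "sports": [("football results", "football")],
    "news": [("bomb threat report", "threat"), ("celebrity gossip", "gossip")],
    "football": [],
    "threat": [],
    "gossip": [],
}

def blind_crawl(web, seed, budget):
    """Breadth-first crawl: visit pages in discovery order until the budget runs out."""
    queue, seen, visited = deque([seed]), {seed}, []
    while queue and len(visited) < budget:
        url = queue.popleft()
        visited.append(url)
        for _anchor, target in web[url]:
            if target in web and target not in seen:
                seen.add(target)
                queue.append(target)
    return visited

def focused_crawl(web, seed, keywords, budget):
    """Best-first crawl: follow links whose anchor text best matches the keywords."""
    frontier, seen, visited = [(0, seed)], {seed}, []
    while frontier and len(visited) < budget:
        _priority, url = heapq.heappop(frontier)
        visited.append(url)
        for anchor, target in web[url]:
            if target in web and target not in seen:
                seen.add(target)
                score = sum(word in keywords for word in anchor.lower().split())
                # heapq is a min-heap, so negate the score to pop the best link first.
                heapq.heappush(frontier, (-score, target))
    return visited

topic = {"security", "threat", "bomb"}
blind = blind_crawl(WEB, "seed", budget=3)
focused = focused_crawl(WEB, "seed", topic, budget=3)
```

With a budget of three pages, the blind crawler spends one fetch on the sports page, while the focused crawler reaches the threat report.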

Although text-based searches have their complications, Hsu said searching for images is even harder because searches rely on the words people use to describe the images, such as a photo caption. Data mining research at K-State and its partner institutions is leading to technology that will allow search engines to "look" through images from the Web. Hsu said search engines would sift through images that are automatically annotated, or marked up, to describe their contents. This would be done using tools that analyze the shape, border, color and orientation of objects, among many other features, to pick out, for instance, an image of George W. Bush in a press conference photo.

"Computers will figure out an image identity by 'seeing' a feature that all such images have in common," Hsu said.
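The idea of a feature that all images of a subject share can be sketched with plain sets. The Python example below assumes each image has already been reduced to a set of symbolic feature names, which is a large simplification: real systems work with numeric descriptors of shape, color and orientation. The labels, feature names and functions are hypothetical.

```python
def common_features(examples):
    """Return the features present in every example image of one label."""
    return set.intersection(*(set(example) for example in examples))

def annotate(image_features, signatures):
    """Return the labels whose shared features all appear in the image."""
    features = set(image_features)
    return sorted(
        label
        for label, signature in signatures.items()
        if signature and signature <= features
    )

# Hypothetical training images, already reduced to feature names.
training = {
    "press conference": [
        {"podium", "microphone", "blue backdrop"},
        {"podium", "microphone", "flag"},
    ],
    "street scene": [
        {"cars", "pavement", "storefront"},
        {"cars", "pavement", "crowd"},
    ],
}
signatures = {label: common_features(images) for label, images in training.items()}
labels = annotate({"podium", "microphone", "crowd"}, signatures)
```

Here the new image is annotated as a press conference because it contains both features the training examples share, even though it also shows a crowd.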

The next generation of data mining research, Hsu said, will involve computer scientists working with social scientists. By scouring news articles and other public data, researchers can work on something called sentiment analysis.

"Sometimes Homeland Security just needs to know, for instance, what the local reaction is to a particular event such as a bomb threat or recent explosion," Hsu said.
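In its simplest form, sentiment analysis can be approximated by counting emotionally loaded words. The Python sketch below scores each article against two small word lists and averages the scores to estimate a local reaction. The word lists, function names and sample sentences are assumptions made for this example, and research systems use far richer models than a fixed lexicon.

```python
# Hypothetical word lists; a real lexicon would be much larger.
POSITIVE = {"calm", "safe", "relieved", "reassured"}
NEGATIVE = {"afraid", "angry", "panic", "fear", "worried"}

def sentiment_score(text):
    """Count positive words minus negative words in one piece of text."""
    words = [word.strip(".,;:!?\"'").lower() for word in text.split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

def local_reaction(articles):
    """Average the per-article scores; below zero suggests a negative reaction."""
    scores = [sentiment_score(article) for article in articles]
    return sum(scores) / len(scores) if scores else 0.0

articles = [
    "Residents were worried and afraid after the threat.",
    "Officials said the area is safe.",
]
reaction = local_reaction(articles)
```

Word counting of this kind misses negation and sarcasm ("not safe" still scores as positive), which is one reason the problem benefits from the collaboration with social scientists that Hsu describes.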
