Monday, March 30, 2015

Regular expression to extract the src attribute value from an image tag or HTML source text


A regular expression can pull specific data out of text input. Here we use one to get the src attribute value from a given image tag or from HTML text.

Regex to extract the image 'src' attribute:

<img[^>]*src=["']([^"']*)


Sample code:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 *
 * @author Uttesh Kumar T.H.
 */
public class ImgTest {

    public static void main(String[] args) {

        String s = "<p><img src=\"38220.png\" alt=\"test\" title=\"test\" /> <img src=\"32222.png\" alt=\"test\" title=\"test\" /></p>";
        // Capture group 1 holds the text between the quotes that follow src=
        Pattern p = Pattern.compile("<img[^>]*src=[\"']([^\"']*)");
        Matcher m = p.matcher(s);
        while (m.find()) {
            // group(1) is the src value itself, so no manual substring math is needed
            String src = m.group(1);
            System.out.println(src);
        }
    }

}
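
The same idea can be wrapped in a reusable method. The following is only a minimal sketch, not a full HTML parser: the class name ImgSrcExtractor and the method extractSrcValues are hypothetical names chosen for illustration. Compared with the sample above, it assumes we also want to match upper-case tags, allow spaces around the equals sign, and skip attributes such as data-src by requiring whitespace before src. It still stops at the first quote of either kind, so a value containing a quote character would be cut short.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class ImgSrcExtractor {

    // \s before "src" keeps "data-src" from matching; \s* allows spaces around "=".
    // Group 1 captures everything up to the next single or double quote.
    private static final Pattern IMG_SRC = Pattern.compile(
            "<img\\b[^>]*?\\ssrc\\s*=\\s*[\"']([^\"']*)[\"']",
            Pattern.CASE_INSENSITIVE);

    // Returns the src values of all img tags found in the given HTML text, in order.
    static List<String> extractSrcValues(String html) {
        List<String> result = new ArrayList<>();
        Matcher m = IMG_SRC.matcher(html);
        while (m.find()) {
            result.add(m.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        String html = "<p><IMG alt='x' data-src=\"lazy.png\" SRC = 'a.png'/>"
                + "<img src=\"b.png\"></p>";
        // Prints one src value per line
        for (String src : extractSrcValues(html)) {
            System.out.println(src);
        }
    }
}
```

For anything beyond simple, well-formed markup, an HTML parser library is usually a safer choice than a regex.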

