Java Tag Content Extractor - Hacker Rank Solution

Hello friends, how are you? Today I am going to solve the HackerRank Java Tag Content Extractor problem with a simple explanation. This is the 24th Java problem on HackerRank. In this article, you will find more than one approach to solving it. So let's start.

Tag Content Extractor - Hacker Rank Solution in Java


HackerRank Java Tag Content Extractor - Problem Statement

In a tag-based language like XML or HTML, contents are enclosed between a start tag and an end tag like <tag>contents</tag>. Note that the corresponding end tag starts with a /.

Given a string of text in a tag-based language, parse this text and retrieve the contents enclosed within sequences of well-organized tags meeting the following criteria:

  1. The name of the start and end tags must be the same. The HTML code <h1>Hello World</h2> is not valid, because the text starts with an h1 tag and ends with a non-matching h2 tag.
  2. Tags can be nested, but content between nested tags is considered not valid. For example, in <h1><a>contents</a>invalid</h1>, contents is valid but invalid is not valid.
  3. Tags can consist of any printable characters.
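
The three rules above can be checked with a single regular expression that uses a backreference. The following is a minimal sketch, not the official reference solution; the class name TagRulesDemo and the helper extract are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names here are illustrative only.
public class TagRulesDemo {
    // Group 1 captures the tag name, and \1 requires the end tag to repeat it (rule 1).
    // [^<]+ keeps the captured content free of any further tag (rule 2).
    // .+ lets the tag name contain spaces and other printable characters (rule 3).
    private static final Pattern TAG = Pattern.compile("<(.+)>([^<]+)</\\1>");

    // Returns the content of every valid start/end tag pair found in the line.
    static List<String> extract(String line) {
        List<String> found = new ArrayList<>();
        Matcher m = TAG.matcher(line);
        while (m.find()) {
            found.add(m.group(2));
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(extract("<h1>Hello World</h2>"));            // []
        System.out.println(extract("<h1><a>contents</a>invalid</h1>")); // [contents]
        System.out.println(extract("<SA premium>text</SA premium>"));   // [text]
    }
}
```

An empty list corresponds to the case where the solution has to print None.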

Input Format

The first line of input contains a single integer, N (the number of lines).
The N subsequent lines each contain a line of text.

Constraints

  • Each line contains a maximum of 10^4 printable characters.
  • The total number of characters in all test cases will not exceed 10^6.
  • 1 <= N <= 100

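Output Format

For each line, print the content enclosed within valid tags. If a line contains multiple instances of valid content, print each instance on a new line; if no valid content is found, print None.

Sample Input

4
<h1>Nayeem loves counseling</h1>
<h1><h1>Sanjay has no watch</h1></h1><par>So wait for a while</par>
<Amee>safat codes like a ninja</amee>
<SA premium>Imtiaz has a secret crush</SA premium>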

Sample Output

Nayeem loves counseling
Sanjay has no watch
So wait for a while
None
Imtiaz has a secret crush

Java Tag Content Extractor - Hacker Rank Solution

Approach I:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;


public class Solution
{
    public static void main(String[] args) throws IOException {
        // Group 1 is the tag name (no '/' or '>'), group 2 is the content
        // (no '<' or '>'), and \1 forces the end tag to repeat the start tag.
        final Pattern tagRE = Pattern.compile("<([^/>]+)>([^<>]+)</\\1>");

        // try-with-resources closes the reader automatically.
        try (BufferedReader br = new BufferedReader(new InputStreamReader(System.in), 64 * 1024)) {
            final int T = Integer.parseInt(br.readLine().trim());

            for (int t = 0; t < T; t++) {
                final String line = br.readLine().trim();

                // Collect the content of every valid tag pair on this line.
                final List<String> res = new ArrayList<>();
                Matcher match = tagRE.matcher(line);

                while (match.find()) {
                    res.add(match.group(2));
                }

                if (res.isEmpty()) {
                    System.out.println("None");
                } else {
                    System.out.println(String.join("\n", res));
                }
            }
        }
    }
}


Approach II:

import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Solution {

    // (.+) captures the tag name, ([^<]+) captures the content,
    // and \1 requires the end tag to use the same name as the start tag.
    private static final String tagReg = "<(.+)>([^<]+)</\\1>";
    private static final Pattern tagPattern = Pattern.compile(tagReg);

    public static void main(String[] args) {

        Scanner in = new Scanner(System.in);
        int testCases = Integer.parseInt(in.nextLine());

        while (testCases > 0) {
            String line = in.nextLine();

            Matcher tagMatcher = tagPattern.matcher(line);
            if (tagMatcher.find()) {
                // Print every piece of valid content on its own line.
                do {
                    System.out.println(tagMatcher.group(2));
                } while (tagMatcher.find());
            } else {
                System.out.println("None");
            }
            testCases--;
        }
    }
}


Approach III:

import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Solution {
    public static void main(String[] args) {

        // Tag name: any run of characters other than '>'.
        // Content: any run of characters other than '<'.
        Pattern pattern = Pattern.compile("<([^>]+)>([^<]+)</\\1>");

        Scanner in = new Scanner(System.in);
        int testCases = Integer.parseInt(in.nextLine());
        while (testCases > 0) {
            String line = in.nextLine();
            Matcher m = pattern.matcher(line);
            // Count matches so we know when to print "None".
            int matches = 0;
            while (m.find()) {
                matches++;
                System.out.println(m.group(2));
            }
            if (matches == 0) {
                System.out.println("None");
            }

            testCases--;
        }
    }
}
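
The three approaches differ mainly in the character class they allow for the tag name: [^/>]+ in Approach I, .+ in Approach II, and [^>]+ in Approach III. The sketch below runs all three patterns on one line of the sample input and on one extra input. The class name PatternComparison and the input <a/b>x</a/b> are hypothetical and chosen only for illustration; the problem's own samples do not include a tag containing a slash, so how the judge treats such input is an assumption left open here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical comparison harness; not part of any submitted solution.
public class PatternComparison {
    // The patterns from Approaches I, II and III, keyed by approach number.
    static final Map<String, Pattern> PATTERNS = new LinkedHashMap<>();
    static {
        PATTERNS.put("I", Pattern.compile("<([^/>]+)>([^<>]+)</\\1>"));
        PATTERNS.put("II", Pattern.compile("<(.+)>([^<]+)</\\1>"));
        PATTERNS.put("III", Pattern.compile("<([^>]+)>([^<]+)</\\1>"));
    }

    // Returns the content (group 2) of every match of the pattern in the line.
    static List<String> extract(Pattern p, String line) {
        List<String> found = new ArrayList<>();
        Matcher m = p.matcher(line);
        while (m.find()) {
            found.add(m.group(2));
        }
        return found;
    }

    public static void main(String[] args) {
        String[] lines = {
            // Second line of the sample input.
            "<h1><h1>Sanjay has no watch</h1></h1><par>So wait for a while</par>",
            // Hypothetical edge case: a tag name that contains '/'.
            "<a/b>x</a/b>"
        };
        for (String line : lines) {
            for (Map.Entry<String, Pattern> e : PATTERNS.entrySet()) {
                System.out.println(e.getKey() + ": " + extract(e.getValue(), line));
            }
        }
    }
}
```

On the sample line, all three patterns yield [Sanjay has no watch, So wait for a while]. On the hypothetical edge case, Approach I yields an empty list (it would print None), while Approaches II and III yield [x].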

Disclaimer: The above problem (Java Tag Content Extractor) was created by HackerRank, but the solution is provided by MyEduWaves. This tutorial is for educational and learning purposes only. For any queries regarding this post or website, please fill out the contact form.

I hope you have understood the solution to this HackerRank problem. All these solutions will pass all the test cases. Now visit the Java Tag Content Extractor problem on HackerRank and try to solve it yourself.

All the Best!
