Replace multiple backslashes and forward slashes with File.separator

9 comments; published: Mon, 24 Sep 2007 05:52:00 GMT

Hi folks,

I'm looking for a way of replacing groups of backslashes and forward slashes with the File.separator value.

For example

com\\\\string/some///other////package

would be converted to

com\string\some\other\package

on Windows.

I tried splitting it into two passes, one for backslashes and one for forward slashes, but no joy.

String path = "com\\\\string/some///other////package ";

String pass1= path.replaceAll( "\\*" , File.separator);

String pass2= pass1.replaceAll( "/" , File.separator);

System.out.println (pass2);

but the forward-slash pass throws an index-out-of-bounds exception.

If I remove the second pass and run only the first, then instead of collapsing groups of backslashes it seems to replace each backslash with a backslash, i.e. no visible effect.

If anyone has any idea as to how to fix this that would be great.

I've got it working with a StringTokenizer and the delimiter string "\\/"

but I'd like to figure out why the regex didn't work.

Thanks,

Mark.
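
The StringTokenizer workaround mentioned in the question is not shown in the thread, so the following is only a guess at its shape. The class and method names (TokenizerPathJoiner, normalize) and the explicit separator parameter are assumptions made so the sketch can be tried on any platform.

```java
import java.io.File;
import java.util.StringTokenizer;

final class TokenizerPathJoiner {
    // Splits on any run of '/' or '\' and rejoins the pieces with the given separator.
    static String normalize(String path, String separator) {
        // The Java literal "\\/" is a two-character delimiter set: '\' and '/'.
        StringTokenizer tokens = new StringTokenizer(path, "\\/");
        StringBuilder out = new StringBuilder();
        while (tokens.hasMoreTokens()) {
            if (out.length() > 0) {
                out.append(separator);
            }
            out.append(tokens.nextToken());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(normalize("com\\\\string/some///other////package", File.separator));
    }
}
```

StringTokenizer skips empty tokens, so runs of delimiters collapse automatically. A leading or trailing separator is dropped as well, which matters for absolute paths.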

All Comments

  • 9 Comments
    • "\\*" should be "\\\\+" (one or more backslashes). You need 4--yes, 4--backslashes in a Java string literal to get a single literal backslash in a regex.

      File.separator could contain \, which needs to be escaped before the regex engine sees it, so you'd probably need to do something like

      String sep = System.getProperty("file.separator").replaceAll("\\\\", "\\\\\\\\");

      and then use sep where you're using File.separator

      Or something like that.

      Why do you need to do this though?

      #1; Mon, 16 Jul 2007 02:02:00 GMT
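
A minimal sketch of the two-pass fix suggested in comment #1, assuming the second pass uses "/+" to collapse runs of forward slashes (the comment only spells out the backslash pattern). The names TwoPassSeparators, escapeForReplacement and normalize are hypothetical, and the separator is passed in so both platforms can be tried.

```java
import java.io.File;

final class TwoPassSeparators {
    // Doubles each backslash so that replaceAll() treats the separator literally.
    static String escapeForReplacement(String separator) {
        return separator.replaceAll("\\\\", "\\\\\\\\");
    }

    static String normalize(String path, String separator) {
        String sep = escapeForReplacement(separator);
        String pass1 = path.replaceAll("\\\\+", sep); // runs of backslashes
        return pass1.replaceAll("/+", sep);           // runs of forward slashes
    }

    public static void main(String[] args) {
        System.out.println(normalize("com\\\\string/some///other////package", File.separator));
    }
}
```

One caveat with two passes: when the separator is a backslash, a mixed run such as /\ ends up as two separators, because each pass only collapses its own character.
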
    • The correct regex for this is "[/\\\\]+". That's a character class which matches either a forward slash or a backslash, and the plus sign causes it to match one or more. Using this regex, you only have to make one pass. As targaryen pointed out, backslashes are also special in the replacement string, so they have to be escaped. Here's another way to do that:

      path = path.replaceAll("[/\\\\]+", "\\" + File.separator);

      #2; Mon, 16 Jul 2007 02:02:00 GMT
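
The single-pass regex from comment #2 can also be wrapped in the kind of static sanitizer method discussed in this thread. This is a sketch only: PathSanitizer and sanitize are hypothetical names, and it swaps the manual "\\" prefix for java.util.regex.Matcher.quoteReplacement (available since Java 5), which escapes both backslashes and dollar signs in the replacement.

```java
import java.io.File;
import java.util.regex.Matcher;

final class PathSanitizer {
    // Collapses every run of '/' and '\' into a single copy of the separator.
    static String sanitize(String path, String separator) {
        return path.replaceAll("[/\\\\]+", Matcher.quoteReplacement(separator));
    }

    // Convenience overload that uses the platform separator.
    static String sanitize(String path) {
        return sanitize(path, File.separator);
    }

    public static void main(String[] args) {
        System.out.println(sanitize("com///sun/\\\\net//misc\\\\\\\\\\classA.class"));
    }
}
```

Because the character class matches both kinds of slash, mixed runs such as /\\ collapse into one separator in a single pass.
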
    • Thanks for the refinement uncle_alice.
      #3; Mon, 16 Jul 2007 02:02:00 GMT
    • Hey Targaryen,

      Thanks a million for this. I can't believe it was quite so hard to find.

      I'm attempting to write a static path sanitizer method that will take in paths

      that may look something like this

      com///sun/\\net//misc\\\\\classA.class

      and convert it to

      com\sun\net\misc\classA.class.

      I'll give that a bash now.

      Cheers,

      Mark.

      #4; Mon, 16 Jul 2007 02:02:00 GMT
    • Hi uncle_alice, thanks also for your help. I'd never have figured that out in a million years. Why is it so complicated? Mark.
      #5; Mon, 16 Jul 2007 02:02:00 GMT
    • Here are a couple of interesting sites about regular expressions,

      and the split method is quite handy:

      http://www.regular-expressions.info/

      http://www.amk.ca/python/howto/regex/

      To answer your question about why it's so complicated:

      it seems that way to us, but it's simple for uncle_alice.

      kind regards

      Walken16

      #6; Mon, 16 Jul 2007 02:02:00 GMT
    • mark_gargan_78:

      I hope this slash conversion voodoo is actually necessary. It is only needed if the paths are being written into a script or otherwise externalized to the OS. If you're doing this just to construct a FileInputStream, for example, you're doing too much work. Just use forward slashes in your Java code and Java will translate them into the platform-specific path names for you.

      #7; Mon, 16 Jul 2007 02:02:00 GMT
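
To illustrate the point in comment #7, here is a small sketch that hands a forward-slash path to java.io.File and reads back the path it stores. ForwardSlashDemo and platformPath are hypothetical names. On Windows the stored path should come back with backslashes, and on Unix-like systems it is unchanged.

```java
import java.io.File;

final class ForwardSlashDemo {
    // Returns the path as java.io.File stores it for the current platform.
    static String platformPath(String forwardSlashPath) {
        return new File(forwardSlashPath).getPath();
    }

    public static void main(String[] args) {
        System.out.println(platformPath("com/sun/net/misc/classA.class"));
    }
}
```

Note that this only covers forward slashes. On Unix-like systems a backslash is an ordinary filename character, so input that already contains backslashes still needs the regex approach.
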
    • > Why is it so complicated?

      If you mean why so many backslashes, it's because regexes and String literals both use the backslash as an escape character. If you want the regex compiler to see one backslash, you have to put two in the String literal. But if you want to match a backslash, you have to escape it with another backslash, which means putting four in the String literal.

      In the second argument to replaceAll(), dollar signs are special because you can embed group references like $0, $1, etc. in the string. To insert a literal '$' in the result, you have to escape it with a backslash. That means literal backslashes also have to be escaped, so it's four for one in the second argument, too.

      #8; Mon, 16 Jul 2007 02:02:00 GMT
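
The escaping rules described in comment #8 can be tried out with a few one-liners. This is an illustrative sketch; the class and method names are made up, and only String.replaceAll is used.

```java
final class ReplacementEscapes {
    // Pattern side: four backslashes in source form the regex \\ which matches ONE backslash.
    static String backslashToSlash(String s) {
        return s.replaceAll("\\\\", "/");
    }

    // Replacement side: four backslashes in source insert ONE literal backslash.
    static String slashToBackslash(String s) {
        return s.replaceAll("/", "\\\\");
    }

    // Replacement side: "\\$" in source is \$ which inserts a literal dollar sign.
    static String slashToDollar(String s) {
        return s.replaceAll("/", "\\$");
    }

    // Replacement side: an unescaped $0 refers to the whole match.
    static String bracketSlashes(String s) {
        return s.replaceAll("/", "[$0]");
    }

    public static void main(String[] args) {
        System.out.println(backslashToSlash("a\\b"));
        System.out.println(slashToBackslash("a/b"));
        System.out.println(slashToDollar("a/b"));
        System.out.println(bracketSlashes("a/b"));
    }
}
```

When no regex features are needed in either argument, String.replace(CharSequence, CharSequence) treats both as literal text and avoids the escaping altogether, although it cannot collapse runs of characters.
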
    • Thanks a million, uncle_alice. This works perfectly for me now. Thanks for the explanation of all the backslashes in the replacement string. Mark.
      #9; Mon, 16 Jul 2007 02:02:00 GMT