I have a grep puzzle that's eluding me: I'd like to remove the text following the final period in a collection of strings (I am using R, so Perl syntax is available).

For example, if the string is ABCD.txt, this grep would return ABCD, and if the text were abc.com.foo.bar, it would return abc.com.foo.

Any help gratefully appreciated (I don't think I can drink any more coffee!).

4 Answers

Here are a few solutions:

sub("^(.*)[.].*", "\\1", "abc.com.foo.bar") # 1
## [1] "abc.com.foo"

library(tools)
file_path_sans_ext("abc.com.foo.bar") # 3
## [1] "abc.com.foo"

ADDED. Regarding your comment asking to remove leading periods, the simplest approach is to strip them first and feed the result into any of the above, where x is the input string:

sub("^[.]*", "", x)

To do any of them in one line:

x <- c("abc.com.foo.bar", ".abc.com.foo.bar", ".vimrc")

sub("^(.*)[.].*", "\\1", sub("^[.]*", "", x)) # 1a
## [1] "abc.com.foo" "abc.com.foo" "vimrc"

file_path_sans_ext(sub("^[.]*", "", x))
## [1] "abc.com.foo" "abc.com.foo" "vimrc" 
  • is it too much to ask for a version that also trims leading periods? such that .vimrc becomes vimrc? (sorry, i didn't realise this case until you solved my major problem).
    – ricardo
    Commented Jul 25, 2013 at 1:19
  • 1
    add \\. after the ^.
    – Justin
    Commented Jul 25, 2013 at 1:20
  • @G.Grothendieck: Thanks for another opportunity to upvote your insightful contributions. You taught me most of what I know about R-regex by way of your many postings to Rhelp.
    – IRTFM
    Commented Jul 25, 2013 at 1:25
  • @Justin -- thanks so much. working perfectly now. wish i'd asked earlier.
    – ricardo
    Commented Jul 25, 2013 at 1:25
  • Why do you show an example with abc.foo.bar (#2)? It's definitely not what the OP wants (and actually it's useless for everyone).
    – vladkras
    Commented Jul 25, 2013 at 1:29

And a non-regex answer for no reason whatsoever:

test <- c("abc.com.foo.bar","ABCD.txt")
sapply(strsplit(test,"\\."), function(x) paste0(head(x,-1),collapse=".") )
#[1] "abc.com.foo" "ABCD"
  • 1
    To be completely accurate this is a simpler regex rather than a non-regex solution as "\\." is a regex. Using strsplit(test, ".", fixed = TRUE) would be a non-regex solution. Commented Nov 14, 2016 at 15:02
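
To illustrate the comment above, here is a minimal sketch of the fully non-regex variant. The helper name drop_last_piece is hypothetical (it is not from the answer); the only real change is passing fixed = TRUE to strsplit so the period is treated as a literal character.

```r
# Hypothetical helper: split on a literal period (fixed = TRUE, so no regex
# is involved), drop the last piece, and glue the rest back together.
drop_last_piece <- function(x) {
  pieces <- strsplit(x, ".", fixed = TRUE)
  vapply(pieces, function(p) paste(head(p, -1), collapse = "."), character(1))
}

test <- c("abc.com.foo.bar", "ABCD.txt")
drop_last_piece(test)
## [1] "abc.com.foo" "ABCD"
```

Note that, like the sapply version, this returns an empty string for an input that contains no period at all, since there is then only one piece to drop.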

You can use sub for example like this:

sub('(.*)[.](.*)','\\1',c('abc.com.foo.bar','ABCD.txt'))
[1] "abc.com.foo" "ABCD"  

I cannot help you with R, and I have almost forgotten Perl, but this works in both JS and PHP:

/\.[A-Za-z]+$/     -->    replace this with empty string ""
  ^    ^    ^
  |    |    |
  |    |    end of line
  |    only chars (you can add 0-9 if numbers are also present)
  dot before last chars

The syntax of regex is rather common, so I'm sure you can adapt it (maybe just get rid of the / delimiters).
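
For what it's worth, one possible R adaptation of the pattern above might look like the sketch below. It assumes the input is a character vector named x (a name chosen here for illustration); the slashes are dropped and the backslash is doubled because it sits inside an R string literal.

```r
# R version of /\.[A-Za-z]+$/ replaced with the empty string.
x <- c("abc.com.foo.bar", "ABCD.txt")
result <- sub("\\.[A-Za-z]+$", "", x)
result
## [1] "abc.com.foo" "ABCD"
```

As the answer notes, the character class only covers letters, so an extension containing digits would need something like [A-Za-z0-9] instead.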
