[NCLUG] sed help ?

Tkil tkil at scrye.com
Sun Mar 2 00:39:22 MST 2003


>>>>> "Gabriel" == Gabriel L Somlo <somlo at acns.colostate.edu> writes:

Gabriel> However, from looking at the file I'm editing again, it turns
Gabriel> out I need to remove the matching line, the line immediately
Gabriel> before it, and the line immediately after it... This
Gabriel> immediately eliminates any 'streaming' or 'filter'
Gabriel> approach... :(

Not really; the filter just has to be a bit smarter:

  perl -lnwe 'if    (/pattern/) { $skip = 1;
                                  $last = undef }
              elsif ( $skip )   { $skip = 0 }
              else              { print $last if defined $last;
                                  $last = $_ }
              END               { print $last if defined $last }'

Note that this is definitely getting to the point where I'd most
likely put this in a script, and not have insanity like this on my
command line anymore.  :)

This is another adventure in program specification, as well.  Now you
have to address these questions, too:

1. What do you do when /pattern/ matches the first line in the file?
   The last?

2. As well as worrying about two consecutive lines matching /pattern/,
   you also now need to worry about /pattern/ matching two lines
   separated by one line.

   Meaning, if you had this text:

      one
      two
      three
      pattern
      five
      pattern
      seven
      eight
      nine
      ten

   What would you expect / require the result to be?  I see two
   "reasonable" interpretations; the first is more local, and I call
   it the "streaming" view -- you really care only about the lines
   literally neighboring the pattern.  The output of this filter would
   be:

      one
      two
      eight
      nine
      ten

   The other view is more "iterative"; after you remove any set of
   line / matching-line / line, re-match against the entire file.  The
   first pass would leave us with:

      one
      two
      pattern
      seven
      eight
      nine
      ten

   Since we matched, we have to iterate over the entire file again,
   yielding:

      one
      eight
      nine
      ten

   Needless to say, my filter (above) implements the "easier"
   streaming model.

Gabriel> Turns out good old 'ed' is what really does the job:

Gabriel> 	echo -e "/some_pattern\n-1,+1d\nw" |
Gabriel> 	  ed my_file.html > /dev/null 2>&1

Gabriel> It's doing the edit 'in place' instead of streaming, but
Gabriel> that's really OK given that I need to go back and forth
Gabriel> inside the file to get what I want...

If you really want in-place editing, perl emulates it with the "-i"
flag.  (To be honest, though, I don't know how well it interacts with
the END block above!)

Glad to hear that 'ed' worked for you.  If you ever need to hammer on
a huge file, though, remember that you can still apply streaming
methods, you just have to have (more) stateful filters.

t.

p.s. I just saw this particularly flagrant abuse of 'sed' on the
     linux-kernel mailing list.  If I understood 'sed' well enough to
     figure it out, I'd probably translate it into perl.  :)

        http://www.uwsg.iu.edu/hypermail/linux/kernel/0303.0/0177.html

     To Mr. Owen's credit, it's quite clearly documented -- but even
     with his documentation, and 10+ years of writing perl, I looked
     at the actual 'sed' commands and went "oof!"


