YALTools::YaDocs

YALTools::YaDocs is designed to process large numbers of documents through the CouchDB REST API.

For performance, results are fetched one page at a time; the page number is used to calculate the skip and limit query parameters.

Page numbers start from one.

If the example database holds seven documents and the limit query option is set to three, all documents are retrieved with the following sequence of URIs:

  /example/_all_docs?limit=3&skip=0  => {..,"rows":[{"_id":..},{"_id":..},{"_id":..}]}
  /example/_all_docs?limit=3&skip=3  => {..,"rows":[{"_id":..},{"_id":..},{"_id":..}]}
  /example/_all_docs?limit=3&skip=6  => {..,"rows":[{"_id":..}]}

The corresponding code is as follows:

  view = YALTools::YaAllDocs.new(@couch, "example")
  q_opts = { "include_docs" => "true" } ## or some query options
  view.each(q_opts, 0, 3) do |resset, skip, page, max_page, max_rows|
     resset.each do |row|
       ...
     end
  end

Page numbers start from one, so a start_page of less than one (such as the 0 passed above) is treated as one. In this example the max_page variable is set to three.
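
The relation between page, skip and limit can be sketched in a few lines. This is only an illustration: page_uris is a hypothetical helper, not part of YALTools, and it keeps the same limit on every page as in the listing above, whereas the library's yield_skip_page shrinks the limit on the last page.

```ruby
# Hypothetical helper (not a YALTools method): lists the _all_docs URIs
# needed to read total_rows documents in pages of `limit` rows.
def page_uris(dbname, total_rows, limit)
  max_page = (total_rows.to_f / limit).ceil   # 7 rows, limit 3 => 3 pages
  (1..max_page).map do |page|
    skip = limit * (page - 1)                 # page 1 => skip 0
    "/#{dbname}/_all_docs?limit=#{limit}&skip=#{skip}"
  end
end

uris = page_uris("example", 7, 3)
puts uris
```

For seven documents and a limit of three this prints three URIs with skip values 0, 3 and 6.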

Please refer to the unittest/ut.yalt.yaview.rb file for the available query options.

Note

YALTools::YaDocs is an abstract class.

Please use the YALTools::YaViewDocs and YALTools::YaAllDocs classes for actual use.

Attributes

debug[RW]

Public Class Methods

new(couch, dbname)

    # File yalt/yaview.rb, line 42
    def initialize(couch, dbname)
      @couch = couch
      @dbname = dbname
      @debug = @couch.debug if @couch.respond_to?("debug")

      @default_query_options = {}
    end

Public Instance Methods

each(query_options={}, start_page=0, limit=15)

Yields [rows, skip, page, max_page, max_rows] once for each page.

The query_options must be valid view query options.

    # File yalt/yaview.rb, line 132
    def each(query_options={}, start_page=0, limit=15) # :yields: rows, skip, page, max_page, max_rows
      pages(@default_query_options.merge(query_options), start_page, limit, true, false) do |rows, skip, page, max_page ,max_rows|
        yield [rows, skip, page, max_page ,max_rows]
      end
    end

each_with_attachments(query_options={}, start_page=0, limit=15)

Yields [rows, skip, page, max_page, max_rows], where each document that has attachments is fetched again together with its attachment data. This may be slow, because every such document requires a separate request.

The query_options must be valid view query options.

    # File yalt/yaview.rb, line 143
    def each_with_attachments(query_options={}, start_page=0, limit=15) # :yields: rows, skip, page, max_page, max_rows
      pages(@default_query_options.merge(query_options), start_page, limit, true, true) do |rows, skip, page, max_page ,max_rows|
        yield [rows, skip, page, max_page ,max_rows]
      end
    end

get_all(query_options={}, current_page=1, limit=15)

Yields [rows, page, next_page_flag].

rows

an instance of YALTools::YaJsonRows.

page

the number of the current page.

next_page_flag

true if a next page exists.

    # File yalt/yaview.rb, line 98
    def get_all(query_options={}, current_page=1, limit=15) # :yields: rows,page,next_page_flag
      opts = @default_query_options.merge(query_options)
      page = current_page.to_i
      page = 1 if page < 1
      while true
        opts["skip"] = limit * (page - 1)
        opts["limit"] = limit + 1
        uri = gen_view_uri(opts)
        $stderr.puts "[debug] get_all() uri=#{uri}" if @debug

        rows = YALTools::YaJsonRows.new(@couch, @dbname)
        json = @couch.get(uri)
        i=0
        next_row = nil
        next_page_flag = false
        json.has_key?("rows") and yield_rows(json["rows"]) do |r|
          if i == limit
            next_page_flag = true
          else
            rows << r
          end
          i += 1
        end
        break if rows.length == 0
        yield [rows, page, next_page_flag]
        break if next_page_flag == false
        page += 1
      end
    end
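
get_all requests limit + 1 rows per query and uses the extra row only to detect whether another page exists. A minimal sketch of that look-ahead technique, assuming an in-memory array in place of CouchDB (fetch and each_page are hypothetical names, not YALTools methods):

```ruby
# Hypothetical stand-in for a CouchDB query: returns up to `limit` items
# starting at offset `skip` from an in-memory array.
def fetch(items, skip, limit)
  items[skip, limit] || []
end

# Look-ahead paging: ask for limit + 1 rows; the extra row is never
# handed to the caller, it only signals that another page exists.
def each_page(items, limit)
  page = 1
  loop do
    fetched = fetch(items, limit * (page - 1), limit + 1)
    break if fetched.empty?
    has_next = fetched.length > limit
    yield fetched.first(limit), page, has_next
    break unless has_next
    page += 1
  end
end

pages = []
each_page((1..7).to_a, 3) { |rows, page, has_next| pages << [rows, page, has_next] }
pages.each { |entry| p entry }
```

The benefit of this approach is that no separate counting query is needed; the cost is one extra row per request.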

max_numrows(options={})

options must be valid view query options.

  options = {"group" => true}
  max_numrows(options) #=> 30

Decision Table (*: can be omitted; del: the option is removed from the query)

 reduce:
  _count | group | group_numrows | startkey/endkey/key | reduce | skip | limit |
 --------+-------+---------------+---------------------+--------+------+-------+
    yes  |   on  |       on      |          off        |  true* |  del |  del  |
    yes  |   on  |       on      |           on        |  true* |  del |  del  |
    yes  |  off  |      off      |           on        |  true  |  del |  del  |
    yes  |  off  |      off      |          off        | false  |  del |    0  |
     no  |  off  |      off      |          off        | false  |  del |    0  |
     no  |  off  |      off      |           on        | false  |  del |  del  |

    # File yalt/yaview.rb, line 66
    def max_numrows(options={})
      opts = {}

      if options.has_key?("startkey") or options.has_key?("endkey") or options.has_key?("key")
        opts["startkey"] = options["startkey"] if options.has_key?("startkey") and not options["startkey"] == nil
        opts["endkey"] = options["endkey"] if options.has_key?("endkey") and not options["endkey"] == nil
        opts["key"] = options["key"] if options.has_key?("key") and not options["key"] == nil
      else
        opts["reduce"] = "false"
        opts["limit"] = "0"
      end
      if options.has_key?("group") and options["group"].to_s == "true"
        opts["group"] = "true"
        opts["group_numrows"] = "true"

        opts.delete("reduce")
        opts.delete("limit")
      end
      opts.delete("include_docs")

      uri = gen_view_uri(opts)
      $stderr.puts "[debug] max_numrows() uri=#{uri}" if @debug

      return total_numrows(@couch.get(uri), opts)
    end
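
To make the option handling easier to follow, the sketch below reproduces only the part of max_numrows that builds the counting query, without contacting CouchDB. count_query_opts is a hypothetical name used for illustration; it is not part of YALTools.

```ruby
# Hypothetical helper: derives the options of the counting query from
# the caller's view query options, following the listing above.
def count_query_opts(options)
  opts = {}
  keyed = %w[startkey endkey key].select { |k| options.key?(k) }
  if keyed.empty?
    # No key range: ask for zero rows and read total_rows from the reply.
    opts["reduce"] = "false"
    opts["limit"] = "0"
  else
    keyed.each { |k| opts[k] = options[k] unless options[k].nil? }
  end
  if options["group"].to_s == "true"
    opts["group"] = "true"
    opts["group_numrows"] = "true"
    opts.delete("reduce")
    opts.delete("limit")
  end
  opts
end

plain   = count_query_opts({ "include_docs" => "true" })
grouped = count_query_opts({ "group" => true })
ranged  = count_query_opts({ "startkey" => "a", "skip" => 3 })
p plain, grouped, ranged
```

Options such as include_docs and skip never reach the counting query, because only the keys handled above are copied.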

page(query_options={}, page_num=0, limit=15)

Returns [rows, skip, page, max_page, max_rows], the same values that YaDocs#each yields.

Only the single page specified by page_num is returned.

    # File yalt/yaview.rb, line 154
    def page(query_options={}, page_num=0, limit=15)
      return pages(@default_query_options.merge(query_options), page_num, limit, false, false)
    end

Private Instance Methods

gen_uri_with_options(baseuri, opts)

Returns the URI built from baseuri and opts.

  gen_uri_with_options("/foo/path", {"k"=>"v","k2"=>"v2"}) #=> "/foo/path?k=v&k2=v2"

    # File yalt/yaview.rb, line 290
    def gen_uri_with_options(baseuri, opts)
      uri = baseuri
      uri += "?" if uri !~ /\?$|\&$/
      tmp_list = []
      opts.kind_of?(Hash) and opts.each do |k,v|
        next if v == nil or (v.respond_to?("empty?") and v.empty?)
        tmp_list << "#{k}=#{v}"
      end
      uri += tmp_list.join("&")
      return uri
    end
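
Two details of this method are easy to miss: options whose value is nil or empty are dropped, and values are inserted as given, without URL escaping. The standalone sketch below mirrors that logic so it can be run outside the class; build_uri is a hypothetical name, and escaping the value with URI.encode_www_form_component is an assumption about how a caller might prepare JSON-valued options, not something the source prescribes.

```ruby
require "uri"

# Standalone sketch mirroring the logic above; build_uri is a
# hypothetical name, not a YALTools method.
def build_uri(baseuri, opts)
  uri = baseuri.dup
  uri += "?" unless uri.end_with?("?", "&")
  pairs = opts.reject { |_k, v| v.nil? || (v.respond_to?(:empty?) && v.empty?) }
              .map { |k, v| "#{k}=#{v}" }
  uri + pairs.join("&")
end

# nil and empty values are skipped.
plain_uri = build_uri("/example/_all_docs",
                      { "limit" => 3, "skip" => 0, "key" => nil, "startkey" => "" })

# Values are not escaped by the method, so the caller escapes them first.
escaped_uri = build_uri("/example/_all_docs",
                        { "key" => URI.encode_www_form_component('"abc"') })

puts plain_uri
puts escaped_uri
```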

gen_view_uri(opts={})

Returns a URI string, such as "/example/_design/all/_view/type?reduce=false&include_docs=true". The implementation shown here uses "/<dbname>/_all_docs" as the base path.

    # File yalt/yaview.rb, line 239
    def gen_view_uri(opts={})
      uri = format("/%s/_all_docs", @dbname)

      msg = { "uri" => uri } and $stderr.puts msg.to_json if @debug

      return gen_uri_with_options(uri, opts)
    end

get_page_with_attachment(doc)

Returns a Hash of the given document, fetched again with attachments=true.

doc

doc == { "_id"=>"xxx", "key1"=>"val1", "key2"=>"val2" }

    # File yalt/yaview.rb, line 230
    def get_page_with_attachment(doc)
      $stderr.puts "[debug] get_page_with_attachment(doc=#{doc})" if @debug
      id = doc["_id"] if doc.has_key?("_id")
      uri = "/#{@dbname}/#{id}?attachments=true"
      $stderr.puts "[debug] get_page_with_attachment() uri=#{uri}" if @debug
      return @couch.get(uri)
    end

pages(options={}, page=0, limit=15, do_iterate=false, with_attachments=false)

Returns (when do_iterate is false) or yields (when do_iterate is true) [rows, skip, page, max_page, max_rows], where rows is a YALTools::YaJsonRows instance.

    # File yalt/yaview.rb, line 163
    def pages(options={}, page=0, limit=15, do_iterate=false, with_attachments=false)
      $stderr.puts "[debug] pages(options=#{options}, page=#{page}, limit=#{limit}, do_iterate=#{do_iterate})"  if @debug
      opts = options.dup
      max_rows = max_numrows(opts)
      $stderr.puts "[debug] pages() max_rows=#{max_rows}" if @debug

      opts["limit"] = limit
      if options.has_key?("group") and options["group"].to_s == "true"
        opts.delete("reduce")
        opts.delete("include_docs")
      else
        opts.delete("group")
        opts["reduce"] = "false"
      end

      ## type 1
      yield_skip_page(limit, max_rows, page) do |i_limit, skip, current_page, max_page|
        opts["skip"] = skip
        opts["limit"] = i_limit
        uri = gen_view_uri(opts)
        $stderr.puts "[debug] pages() uri=#{uri}" if @debug

        resset = YALTools::YaJsonRows.new(@couch, @dbname)
        json = @couch.get(uri)
        json.has_key?("rows") and yield_rows(json["rows"]) do |doc|
          if with_attachments and doc.has_key?("_attachments")
            resset << get_page_with_attachment(doc)
          else
            resset << doc
          end
        end
        if do_iterate
          yield [resset, skip, current_page, max_page ,max_rows]
        else
          return [resset, skip, current_page, max_page ,max_rows]
        end
      end

      ## type 2
    =begin
      yield_skip_page_r(limit, max_rows, page, opts) do |i_limit, skip, current_page, max_page, new_opts|
        new_opts["skip"] = skip
        new_opts["limit"] = i_limit
        uri = gen_view_uri(new_opts)
        $stderr.puts "[debug] pages() uri=#{uri}" if @debug

        resset = YALTools::YaJsonRows.new(@couch, @dbname)
        json = @couch.get(uri)
        json.has_key?("rows") and yield_rows(json["rows"]) do |doc|
          if with_attachments and doc.has_key?("_attachments")
            resset << get_page_with_attachment(doc)
          else
            resset << doc
          end
        end
        if do_iterate
          yield [resset.reverse, (max_rows - skip - i_limit), (max_page - current_page + 1), max_page ,max_rows]
        else
          return [resset.reverse, (max_rows - skip - i_limit), (max_page - current_page + 1), max_page ,max_rows]
        end
      end
    =end
    end

total_numrows(json, opts={})

Returns the total number of results.

  total_numrows(json) #=> 30

It accepts the following JSON data formats:

  • {"total_rows":11,"offset":0,"rows":[]}

  • {"rows":[{"key":null,"value":10}]}

  • {"group_numrows":"1"}

    • {"rows":[..]} (if group_numrows is not implemented)

The "group_numrows" format is a special case. Please see [github.com/YasuhiroABE/CouchDB-Group_NumRows].

    # File yalt/yaview.rb, line 260
    def total_numrows(json, opts={})
      $stderr.puts "total_numrows(json=#{json}, opts=#{opts})" if @debug
      ret = 0
      if json.kind_of?(Hash)
        if json.has_key?("total_rows")
          ret = json["total_rows"].to_i
          if json.has_key?("rows") and json["rows"].kind_of?(Array)
            i = json["rows"].length
            ret = i if i > 0
          end
        elsif json.has_key?("rows") and json["rows"][0].kind_of?(Hash) and json["rows"][0].has_key?("value")
          if json["rows"].size == 1 and json["rows"][0].has_key?("key") and json["rows"][0]["key"] == nil
            ret = json["rows"][0]["value"].to_i
          elsif opts.has_key?("group_numrows") and opts["group_numrows"].to_s == "true"
            ## if group_numrows is not implemented.
            ret = json["rows"].size
          end
        elsif json.has_key?("group_numrows")
          ret = json["group_numrows"].to_i
        end
      end

      $stderr.puts "[debug] total_numrows=#{ret}" if @debug
      return ret
    end
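
The accepted formats can be tried out with a compact standalone version of the same decision logic. This is a sketch for illustration only: count_rows and its group_numrows keyword are hypothetical and replace the opts hash of the real method.

```ruby
require "json"

# Hypothetical standalone variant of the logic above.
def count_rows(json, group_numrows: false)
  return 0 unless json.is_a?(Hash)
  rows = json["rows"]
  if json.key?("total_rows")
    # A non-empty rows array takes precedence over total_rows.
    rows.is_a?(Array) && !rows.empty? ? rows.length : json["total_rows"].to_i
  elsif rows.is_a?(Array) && rows[0].is_a?(Hash) && rows[0].key?("value")
    if rows.size == 1 && rows[0].key?("key") && rows[0]["key"].nil?
      rows[0]["value"].to_i        # single reduce row, e.g. from _count
    elsif group_numrows
      rows.size                    # server without group_numrows support
    else
      0
    end
  elsif json.key?("group_numrows")
    json["group_numrows"].to_i
  else
    0
  end
end

a = count_rows(JSON.parse('{"total_rows":11,"offset":0,"rows":[]}'))
b = count_rows(JSON.parse('{"rows":[{"key":null,"value":10}]}'))
c = count_rows(JSON.parse('{"group_numrows":"1"}'))
d = count_rows(JSON.parse('{"rows":[{"key":"a","value":2},{"key":"b","value":5}]}'),
               group_numrows: true)
p [a, b, c, d]
```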

yield_rows(rows)

Iterates over each row of the rows array. When a row contains an included document (include_docs=true), the document is yielded instead of the row.

    # File yalt/yaview.rb, line 392
    def yield_rows(rows) # :yields: row
      rows.respond_to?(:each) and rows.each do |row|
        if row.has_key?("doc") and row["doc"].kind_of?(Hash)
          ## if include_docs=true
          yield row["doc"]
        else
          yield row
        end
      end
    end
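
The effect of this unwrapping is easiest to see on sample data. The following sketch applies the same rule to a small array; unwrap_rows is a hypothetical name, and the sample rows are made up for illustration.

```ruby
# Hypothetical helper: returns the embedded document when a row carries
# one (include_docs=true), otherwise the row itself.
def unwrap_rows(rows)
  rows.map { |row| row["doc"].is_a?(Hash) ? row["doc"] : row }
end

sample = [
  { "id" => "a", "key" => "a", "doc" => { "_id" => "a", "name" => "x" } },
  { "id" => "b", "key" => "b" }
]
unwrapped = unwrap_rows(sample)
p unwrapped
```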

yield_skip_page(limit, total_rows, start_page=1)

Iterates over the limit, skip and page parameters for a view query.

The max_page value is total_rows / limit rounded up, as in the following examples.

    +------------+-------+------------------+----------+
    | total_rows | limit | total_rows/limit | max_page |
    +------------+-------+------------------+----------+
    |     10     |   4   |        2         |    3     |
    +------------+-------+------------------+----------+
    |     10     |   5   |        2         |    2     |
    +------------+-------+------------------+----------+
    |     10     |   6   |        1         |    2     |
    +------------+-------+------------------+----------+

    # File yalt/yaview.rb, line 316
    def yield_skip_page(limit, total_rows, start_page=1) # :yields: limit, skip, page, max_page
      max_page = (total_rows.to_f / limit.to_f).ceil

      page = start_page <= max_page ? start_page : max_page
      page = 1 if page < 1 ## 'page' must be greater than one, even though max_page is zero.
      skip = limit * (page - 1)

      while page <= max_page
        if page == max_page
          tmp_r = (total_rows % limit)
          yield [tmp_r == 0 ? limit : tmp_r, skip,page,max_page]
        else
          yield [limit,skip,page,max_page]
        end
        skip = limit * page
        page += 1
      end
    end
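
The rows of the table can be reproduced with a short sketch that returns the yielded values as an array instead of yielding them. page_windows is a hypothetical name, and the sketch always starts from page one.

```ruby
# Hypothetical helper: returns [limit, skip, page, max_page] for every
# page, with the limit of the last page shrunk to the remaining rows.
def page_windows(limit, total_rows)
  max_page = (total_rows.to_f / limit).ceil
  remainder = total_rows % limit
  (1..max_page).map do |page|
    page_limit = (page == max_page && remainder != 0) ? remainder : limit
    [page_limit, limit * (page - 1), page, max_page]
  end
end

w4 = page_windows(4, 10)
w5 = page_windows(5, 10)
w6 = page_windows(6, 10)
p w4, w5, w6
```

With ten rows and a limit of four, the last of the three pages asks for only the two remaining rows.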

yield_skip_page_r(unit, total_rows, start_page=1, query_opts)

yield_skip_page_r is a reverse version of yield_skip_page.

It is an experimental implementation.

start_page ranges from one to max_page; values less than one are rounded up to one.

    # File yalt/yaview.rb, line 341
    def yield_skip_page_r(unit, total_rows, start_page=1, query_opts) # :yields: limit, skip, page, max_page, new_query_opts
      opts = query_opts.dup
      limit = unit

      # swaping +startkey+, +endkey+ and "descending" options.
      case "#{opts.has_key?('startkey')}.#{opts.has_key?('endkey')}"
      when "true.true"
        opts["startkey"] = query_opts["endkey"]
        opts["endkey"] = query_opts["startkey"]
      when "true.false"
        opts["endkey"] = opts["startkey"]
        opts.delete("startkey")
      when "false.true"
        opts["startkey"] = opts["endkey"]
        opts.delete("endkey")
      end
      if opts.has_key?("descending")
        opts["descending"] = (opts["descending"].to_s == "true") ? "false" : "true"
      else
        opts["descending"] = "true"
      end

      max_page = (total_rows.to_f / limit.to_f).ceil

      sanitized_start_page = start_page > 0 ? start_page : 1
      page = (max_page - sanitized_start_page + 1)
      page = 1 if page < 1 ## 'page' must be greater than one, even though max_page is zero.
      skip = total_rows - (limit * (max_page - page + 1))
      skip = total_rows if skip > total_rows
      skip = 0 if skip < 0

      $stderr.puts "[debug] yield_skip_page_r() sanitized_start_page=#{sanitized_start_page},skip=#{skip},page=#{page}" if @debug

      while page >= 1
        $stderr.puts "[debug] yield_skip_page_r() skip=#{skip},page=#{page},max_page=#{max_page}" if @debug
        if skip == 0
          tmp_limit = total_rows % limit
          tmp_limit = limit if tmp_limit == 0
          yield [tmp_limit, skip, page, max_page, opts]
        else
          yield [limit, skip, page, max_page, opts]
        end
        page -= 1
        # skip = limit * (page - 1)
        skip -= limit
        skip = 0 if skip < 0
      end
    end
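
The first half of this method reverses the direction of the query by swapping startkey and endkey and toggling descending. A minimal sketch of just that step, with reverse_query_opts as a hypothetical name:

```ruby
# Hypothetical helper: returns a copy of the query options that walks
# the same key range in the opposite direction.
def reverse_query_opts(query_opts)
  opts = query_opts.dup
  has_start = query_opts.key?("startkey")
  has_end   = query_opts.key?("endkey")
  if has_start && has_end
    opts["startkey"] = query_opts["endkey"]
    opts["endkey"]   = query_opts["startkey"]
  elsif has_start
    opts["endkey"] = opts.delete("startkey")
  elsif has_end
    opts["startkey"] = opts.delete("endkey")
  end
  # A missing "descending" option counts as "false" and becomes "true".
  opts["descending"] = opts["descending"].to_s == "true" ? "false" : "true"
  opts
end

both     = reverse_query_opts({ "startkey" => "a", "endkey" => "m" })
only_one = reverse_query_opts({ "startkey" => "a", "descending" => "true" })
p both, only_one
```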

Generated with the Darkfish Rdoc Generator 1.1.6.